MULTIPLE RECTILINEAR PREDICTION AND THE
RESOLUTION INTO COMPONENTS*


LOUIS GUTTMAN 
DEPARTMENT OF SOCIOLOGY, 
UNIVERSITY OF MINNESOTA 


It is assumed that a battery of n tests has been resolved into
components in a common factor space of r dimensions and a unique
factor space of at most n dimensions, where r is much less than n.
Simplified formulas for ordinary multiple and partial correlation of
tests are derived directly in terms of the components. The best (in
the sense of least squares) linear regression equations for predicting
factor scores from test scores are derived also in terms of the
components. Spearman’s “single factor” prediction formulas emerge as
special cases. The last part of the paper shows how the communality
is an upper bound for multiple correlation. A necessary and sufficient
condition is established for the square of the multiple correlation
coefficient of test j on the remaining n−1 tests to approach the
communality of test j as a limit as n increases indefinitely while r
remains constant. Limits are established for partial correlation and
regression coefficients and for the prediction of factor scores.


CONTENTS 


I. INTRODUCTION. 
1. The Problems. 
2. The Fundamental Theorem of Factor Analysis. 
3. The Determinantal Formulas for Multiple and Partial Cor- 
relation. 
II. THE PREDICTION OF TEST SCORES. 
4. The Evaluation of R and of Principal Minors. 
5. The Evaluation of Asymmetric Cofactors. 
6. The Alternative Evaluation of Principal Minors. 
7. The Coefficients. 
III. THE PREDICTION OF FACTOR SCORES. 
8. Two Theorems. 
9. The Regression of a Common Factor on the Remaining 
Common Factors and the Tests. 
a.) The multiple correlation coefficient. 
b.) The regression coefficients of the tests. 
c.) The regression coefficients of the remaining common factors.
d.) Résumé of the coefficients.
*I am indebted to Professor Dunham Jackson for helpful criticism of most
of this paper.

10. The Regression of a Common Factor on the Tests Alone. 

11. The Regression of a Unique Factor on the Tests Alone. 

12. Compact Derivation of the Regressions on Tests of Tests 
and Factors. 

IV. LIMITS OF CORRELATION. 

13. The Communality as an Upper Bound for Multiple Corre- 
lation. 

14. The Communality as a Limit for Multiple Correlation. 

15. A Limit for Partial Correlation and Regression Coeffici- 
ents. 

16. Limits for the Prediction of Factor Scores. 

17. Perfect Prediction. 

18. Improving Multiple Correlation. 


I. INTRODUCTION 

1. The Problems. Component or “factor” analysis derives its 
inspiration and impetus from the quest after physically meaningful 
coordinate axes. It is of interest and of practical advantage, how- 
ever, to inquire also into the relationship between component analy- 
sis and the problems of multiple and partial correlation. It is clear 
at the outset, from the invariance of the correlation matrix under 
transformation of the coordinate axes, that this relationship does not 
depend upon any particular interpretation or location of these axes. 
Therefore the results of our inquiry will apply with full generality to 
any system of components derived by any method of analysis which 
reproduces the observed correlation coefficients. I refer especially to 
those techniques which take into consideration the possible existence 
of a unique factor in each variable. Only orthogonal axes will be con- 
sidered. 

Part II of this paper shows how multiple and partial correlation
coefficients and regression equations can be computed directly from
the components. If r is the number of “common factors,” then the
computations involve essentially only determinants of order r+1.
Thus, if the number of common factors is small compared with the
number of tests in the correlation matrix, it may prove economical to
compute multiple and partial regressions by first resolving the tests
into components.

Part III develops regression equations for predicting scores of
individuals on the factors.* The formulas are again directly in terms
of the components, and involve essentially only rth order determinants,
keeping computation to a minimum. Spearman’s formulas for
his best “team” of tests, to measure his general factor, emerge as
special cases.

*For previous attacks on this problem see references 3; 4; 6; 7; 10; 11;
12, p. 228; 13. Spearman’s pioneering work is in (9).

Part IV establishes limits for multiple and partial correlation 
from knowledge of the components. 

Compact numerical methods for computing regressions for all 
tests and for all factors simultaneously will be exhibited in a future 
paper. 

2. The Fundamental Theorem of Factor Analysis. The component
analyst is confronted by the nth order Gramian matrix of the
intercorrelations of n variables. For convenience let us call the variables
“tests.” Each test has been given to a sample of N individuals and
may be regarded as a vector in the N−1 dimensions specified by the
scores of the individuals measured from their means as origin (5). By
any of several methods, the test vectors can be resolved into orthogonal
factor vectors. The present case of greatest interest is that where
the vectors are resolved into components in a common factor space of
r dimensions and a unique factor† space of n dimensions, the unique
factor space being orthogonal to the common factor space, and where
r is much less than n.

The matrix of the intercorrelations of the n tests is 


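$$
R = \begin{Vmatrix}
1 & r_{12} & r_{13} & \cdots & r_{1n} \\
r_{21} & 1 & r_{23} & \cdots & r_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
r_{n1} & r_{n2} & r_{n3} & \cdots & 1
\end{Vmatrix} .
$$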


The matrices of the components of the n test vectors in the common 
factor space and the unique factor space are respectively 


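$$
A = \begin{Vmatrix}
a_{11} & a_{12} & \cdots & a_{1r} \\
a_{21} & a_{22} & \cdots & a_{2r} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nr}
\end{Vmatrix} ,
\qquad
U = \begin{Vmatrix}
u_1 & & & \\
 & u_2 & & \\
 & & \ddots & \\
 & & & u_n
\end{Vmatrix} .
$$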


Test and factor scores are assumed to be in standard form, with
zero means and unit variances. It can easily be shown that the $a_{jk}$ and
the $u_j$ are the correlations between test j and the respective factors.
U is a diagonal matrix of order n because each unique factor correlates
zero with all the tests except the one which contains it. Each test
is regarded as a unit vector in at most r+1 factor dimensions, and the
configuration of tests lies in n+r factor dimensions.

† A unique factor may be thought of as composed of an error factor and a
factor specific to the test for this particular battery of tests. However, a given
analysis into components does not distinguish between these. Therefore, it will
be useful here to think of a unique factor as a single factor.

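If we form the supermatrix

$$ F = \begin{Vmatrix} A & U \end{Vmatrix} , $$

then the fundamental theorem of component analysis (cf. 12, p. 70)
is that

$$ R = FF' , $$

the prime indicating the transposed matrix as is customary. It follows
that

$$ R = \begin{Vmatrix} A & U \end{Vmatrix} \cdot \begin{Vmatrix} A' \\ U \end{Vmatrix} = AA' + U^2 . $$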

3. The Determinantal Formulas for Multiple and Partial Correlation.
The determinantal formulas* afford a powerful means of handling
regressions. Since the rest of the paper is concerned almost solely
with them, it will be useful to review them here. If we let $R_{jk}$ be the
cofactor of $r_{jk}$ in R, then the square of the multiple correlation of
variable j on the n−1 remaining variables is

$$ 1 - \frac{R}{R_{jj}} . $$

The partial correlation between variables j and k, “holding constant”
the remaining n−2 variables, is

$$ - \frac{R_{jk}}{\sqrt{R_{jj} R_{kk}}} . $$

If the variables have unit variances, the regression coefficient of variable
k in the regression of variable j on the remaining n−1 variables
is

$$ - \frac{R_{jk}}{R_{jj}} . $$

II. THE PREDICTION OF TEST SCORES 


4. The Evaluation of R and of Principal Minors. For the values
of multiple correlation coefficients we need to determine R and $R_{jj}$.
Here and throughout this paper, unless otherwise specified, we shall
assume U to be non-singular, i.e., none of the uniquenesses vanish. It
will be shown that if U is non-singular, then R is non-singular. The
case where U is singular will be treated in §17.

* A convenient method of derivation is given in (2).

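Denote the inverse of U by V. Then

$$
V = \begin{Vmatrix}
\frac{1}{u_1} & & & \\
 & \frac{1}{u_2} & & \\
 & & \ddots & \\
 & & & \frac{1}{u_n}
\end{Vmatrix} .
$$

Form the supermatrix

$$ D = \begin{Vmatrix} I & A' \\ -A & U^2 \end{Vmatrix} . $$

I is the unit matrix and here is of order r. The order of all such
submatrices will be apparent by inspection of the supermatrix in which
they occur.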

Premultiply the first row of D by A and add to the second row.
Then

$$ D = \begin{vmatrix} I & A' \\ 0 & AA' + U^2 \end{vmatrix} . $$

From Laplace’s development,

$$ D = | AA' + U^2 | = | R | . $$

Postmultiply the second column of D by $V^2A$ and add to the first
column; then

$$ D = \begin{vmatrix} I + A'V^2A & A' \\ 0 & U^2 \end{vmatrix} . $$

Let

$$ Q = I + A'V^2A . $$
Q is a Gramian matrix of order and rank r, since it is the product of
the supermatrix

$$ \begin{Vmatrix} I & A'V \end{Vmatrix} $$

80 PSYCHOMETRIKA 


and the matrix transposed. Then

$$ D = U^2 Q $$

or

$$ R = U^2 Q . $$

$U^2$, here the determinant of a diagonal matrix and therefore simply the
product of the uniquenesses $u_j^2$, affords no difficulty in evaluation.
Therefore R can be evaluated essentially from an rth order determinant Q,
regardless of the number of tests. Since U and Q are non-singular, R is
non-singular.
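As a numerical check of the identity R = U²Q, the following sketch builds R = AA′ + U² and Q = I + A′V²A for a small made-up battery and compares the nth-order determinant with the rth-order one. The loadings in `A` and the helper `det` are illustrative assumptions, not taken from the paper; exact fractions are used so the comparison is free of rounding.

```python
from fractions import Fraction as F

def det(m):
    """Determinant by exact Gaussian elimination (entries become Fractions)."""
    m = [[F(x) for x in row] for row in m]
    d = F(1)
    for c in range(len(m)):
        # find a row at or below c with a non-zero entry in column c
        p = next((i for i in range(c, len(m)) if m[i][c] != 0), None)
        if p is None:
            return F(0)
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d                      # a row swap flips the sign
        d *= m[c][c]
        for i in range(c + 1, len(m)):
            f = m[i][c] / m[c][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[c])]
    return d

# Hypothetical loadings of n = 4 tests on r = 2 orthogonal common factors.
A = [[F(1, 2), F(1, 2)],
     [F(3, 5), F(1, 5)],
     [F(2, 5), F(3, 5)],
     [F(1, 2), F(1, 4)]]
n, r = len(A), len(A[0])
u2 = [1 - sum(a * a for a in row) for row in A]      # uniquenesses u_j^2

# R = AA' + U^2 (unit diagonal) and Q = I + A'V^2 A.
R = [[sum(A[j][t] * A[k][t] for t in range(r)) + (u2[j] if j == k else 0)
      for k in range(n)] for j in range(n)]
Q = [[(1 if k == l else 0) + sum(A[j][k] * A[j][l] / u2[j] for j in range(n))
      for l in range(r)] for k in range(r)]

prod_u2 = F(1)
for v in u2:
    prod_u2 *= v                                     # determinant of U^2

det_R = det(R)                   # determinant of order n
det_R_via_Q = prod_u2 * det(Q)   # needs only a determinant of order r
print(det_R, det_R_via_Q)
```

The saving grows with the battery: the order of Q stays at r however many tests are added, and each new test only contributes one more term to the sums.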

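In expanded notation,

$$
Q = \begin{vmatrix}
1 + \sum_{j=1}^{n} \frac{a_{j1}^2}{u_j^2} & \sum_{j=1}^{n} \frac{a_{j1}a_{j2}}{u_j^2} & \cdots & \sum_{j=1}^{n} \frac{a_{j1}a_{jr}}{u_j^2} \\
\sum_{j=1}^{n} \frac{a_{j2}a_{j1}}{u_j^2} & 1 + \sum_{j=1}^{n} \frac{a_{j2}^2}{u_j^2} & \cdots & \sum_{j=1}^{n} \frac{a_{j2}a_{jr}}{u_j^2} \\
\vdots & \vdots & & \vdots \\
\sum_{j=1}^{n} \frac{a_{jr}a_{j1}}{u_j^2} & \sum_{j=1}^{n} \frac{a_{jr}a_{j2}}{u_j^2} & \cdots & 1 + \sum_{j=1}^{n} \frac{a_{jr}^2}{u_j^2}
\end{vmatrix} .
$$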


Any first principal minor $R_{jj}$ can be similarly treated by forming
a $D_{jj}$ omitting the components of the jth test from A and U. In general,
to evaluate a principal minor obtained by striking out any m rows
and columns in R, we must evaluate the determinant obtained by
omitting the m variables involved from the summations in Q and multiply
that determinant by the product of the uniquenesses (squares of
the unique components) of the n−m remaining variables. These formulas,
then, enable one to compute any desired multiple correlation coefficient
directly from the components. For example, the square of
the multiple correlation of the first test on the n−1 remaining tests is

$$ 1 - u_1^2 \frac{Q}{Q_1} , $$

where $Q_1$ is the same as Q except for the omission of test 1 from the
summations.
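The example just given can be checked numerically. The sketch below computes the squared multiple correlation of the first test both from the determinantal formula 1 − R/R₁₁ of §3 and from the component formula 1 − u₁²Q/Q₁. The loadings and the helpers `det` and `q_matrix` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction as F

def det(m):
    """Determinant by exact Gaussian elimination (entries become Fractions)."""
    m = [[F(x) for x in row] for row in m]
    d = F(1)
    for c in range(len(m)):
        p = next((i for i in range(c, len(m)) if m[i][c] != 0), None)
        if p is None:
            return F(0)
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for i in range(c + 1, len(m)):
            f = m[i][c] / m[c][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[c])]
    return d

def q_matrix(A, u2, omit=()):
    """Q = I + A'V^2 A, with the tests listed in `omit` left out of the sums."""
    r = len(A[0])
    keep = [j for j in range(len(A)) if j not in omit]
    return [[(1 if k == l else 0) + sum(A[j][k] * A[j][l] / u2[j] for j in keep)
             for l in range(r)] for k in range(r)]

# Hypothetical loadings of n = 4 tests on r = 2 orthogonal common factors.
A = [[F(1, 2), F(1, 2)],
     [F(3, 5), F(1, 5)],
     [F(2, 5), F(3, 5)],
     [F(1, 2), F(1, 4)]]
n, r = len(A), len(A[0])
u2 = [1 - sum(a * a for a in row) for row in A]
R = [[sum(A[j][t] * A[k][t] for t in range(r)) + (u2[j] if j == k else 0)
      for k in range(n)] for j in range(n)]

j = 0                                                 # the first test
R_jj = [[R[i][k] for k in range(n) if k != j] for i in range(n) if i != j]

smc_direct = 1 - det(R) / det(R_jj)                   # 1 - R / R_11, order n
smc_from_q = 1 - u2[j] * det(q_matrix(A, u2)) / det(q_matrix(A, u2, omit=(j,)))
communality = 1 - u2[j]
print(smc_direct, smc_from_q, communality)
```

Only the two rth-order determinants Q and Q₁ are needed on the component side, whatever the number of tests.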

5. The Evaluation of Asymmetric Cofactors. The determinantal
formulas for partial correlation and for regression coefficients require
the evaluation of cofactors $R_{jk}$ where j ≠ k. Let $b_j$ be the jth row
vector of A. It has r elements. Let $B_{jk}$ be the (n−2) × r matrix obtained
from A by omitting $b_j$ and $b_k$. Let $U_{jk}$ be the (n−2) × (n−2)
matrix obtained from U by omitting rows and columns j and k.

To obtain a formula for $R_{12}$, form the supermatrix

$$
D_{12} = \begin{Vmatrix}
I & b_1' & b_2' & B_{12}' \\
0 & 0 & 1 & 0 \\
-b_2 & 0 & u_2^2 & 0 \\
-B_{12} & 0 & 0 & U_{12}^2
\end{Vmatrix} .
$$

Premultiply the first row by $b_2$ and add to the third row, and premultiply
the first row by $B_{12}$ and add to the fourth row. Then

$$
D_{12} = \begin{vmatrix}
I & b_1' & b_2' & B_{12}' \\
0 & 0 & 1 & 0 \\
0 & b_2 b_1' & b_2 b_2' + u_2^2 & b_2 B_{12}' \\
0 & B_{12} b_1' & B_{12} b_2' & B_{12} B_{12}' + U_{12}^2
\end{vmatrix} .
$$

It is easily verified that the determinant of the complement of I in the
above determinant is actually $R_{12}$. Therefore,

$$ D_{12} = R_{12} . $$


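Going back to $D_{12}$, postmultiply the second column by $\frac{1}{u_1^2} b_1$, the
third column by $\frac{1}{u_2^2} b_2$, the fourth column by $V_{12}^2 B_{12}$, and add these to
the first column, leaving

$$
D_{12} = \begin{vmatrix}
Q & b_1' & b_2' & B_{12}' \\
\frac{1}{u_2^2} b_2 & 0 & 1 & 0 \\
0 & 0 & u_2^2 & 0 \\
0 & 0 & 0 & U_{12}^2
\end{vmatrix}
$$

or

$$ R_{12} = U_{12}^2 \begin{vmatrix} Q & b_1' \\ b_2 & 0 \end{vmatrix} . $$

It is seen that in general,* if j ≠ k,

$$ R_{jk} = U_{jk}^2 \begin{vmatrix} Q & b_j' \\ b_k & 0 \end{vmatrix} . $$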


6. Alternative Evaluation of Principal Minors. For computational
purposes, it is highly desirable to have a formula for principal
minors, especially first principal minors, that is analogous to that for
$R_{jk}$ above. Let $b_m$ be composed of any m rows of A, and let $U_m$ be
composed of the corresponding rows and columns of U, with $V_m$ its inverse.
Form the supermatrix

$$ P_m = \begin{Vmatrix} Q & b_m' \\ b_m & U_m^2 \end{Vmatrix} . $$

Postmultiply the second column by $V_m^2 b_m$ and subtract from the first
column. Then

$$ P_m = \begin{vmatrix} Q_m & b_m' \\ 0 & U_m^2 \end{vmatrix} . $$

The subscript of $Q_m$ indicates that the m tests involved have been
omitted from the summations in Q. Thus we have

$$ Q_m = V_m^2 P_m . $$

But, from §4, if we let $R_{mm}$ indicate the principal minor in R obtained
by omitting the m rows and columns involved here, then

$$ R_{mm} = U^2 V_m^2 Q_m . $$

Therefore,

$$ R_{mm} = U^2 V_m^4 P_m . $$

In particular if we select m as involving only test j, the first principal
minor corresponding to test j can be evaluated from

$$ R_{jj} = \frac{1}{u_j^4} U^2 \begin{vmatrix} Q & b_j' \\ b_j & u_j^2 \end{vmatrix} . $$

* The formula may be written also with $Q_j$, $Q_k$, or $Q_{jk}$ substituted for Q,
indicating that either or both of tests j and k may be omitted from the summation
in Q without changing the value of the determinant. This is so because the
second column postmultiplied by $\frac{1}{u_j^2} b_j$ may be subtracted from the first column,
or the second row premultiplied by $b_k' \frac{1}{u_k^2}$ may be subtracted from the first row,
or both operations may be done.


7. The Coefficients. We are now in a position to write explicitly 
the multiple, partial, and regression coefficients: 

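(1) The square of the multiple correlation of the jth test on the
n−1 remaining tests is either

$$ 1 - u_j^2 \frac{Q}{Q_j} $$

or

$$ 1 - u_j^4 \, Q \begin{vmatrix} Q & b_j' \\ b_j & u_j^2 \end{vmatrix}^{-1} . $$

(2) The partial correlation between the jth and kth tests, “holding
constant” the remaining n−2 tests, is

$$
- \frac{\begin{vmatrix} Q & b_j' \\ b_k & 0 \end{vmatrix}}
{\sqrt{\begin{vmatrix} Q & b_j' \\ b_j & u_j^2 \end{vmatrix} \begin{vmatrix} Q & b_k' \\ b_k & u_k^2 \end{vmatrix}}} .
$$

(3) The regression coefficient of the kth test in the regression
of the jth test on the remaining n−1 tests is, for unit variances,

$$
- \frac{u_j^2}{u_k^2} \cdot
\frac{\begin{vmatrix} Q & b_j' \\ b_k & 0 \end{vmatrix}}
{\begin{vmatrix} Q & b_j' \\ b_j & u_j^2 \end{vmatrix}} .
$$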


The ease with which these can be computed simultaneously, once 
Q has been set up, will be shown in a future paper. If it is desired to 
eliminate any tests from the regressions, it is necessary only to omit 
those tests from the summations in Q. 
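A small numerical sketch may make formulas (2) and (3) concrete. It evaluates the regression coefficient and the squared partial correlation of one pair of tests twice: once from cofactors of R, as in §3, and once from determinants of order r+1 formed by bordering Q. The data and the helper names (`det`, `cofactor`, `border`) are assumptions made for the illustration; the partial correlation is compared in squared form to keep the arithmetic exact.

```python
from fractions import Fraction as F

def det(m):
    """Determinant by exact Gaussian elimination (entries become Fractions)."""
    m = [[F(x) for x in row] for row in m]
    d = F(1)
    for c in range(len(m)):
        p = next((i for i in range(c, len(m)) if m[i][c] != 0), None)
        if p is None:
            return F(0)
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for i in range(c + 1, len(m)):
            f = m[i][c] / m[c][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[c])]
    return d

def cofactor(M, j, k):
    """Signed cofactor of element (j, k) of the square matrix M."""
    sub = [[M[i][c] for c in range(len(M)) if c != k]
           for i in range(len(M)) if i != j]
    return (-1) ** (j + k) * det(sub)

def border(Q, col, row, corner):
    """The (r+1)-order array | Q col' ; row corner |."""
    return [q_row + [c] for q_row, c in zip(Q, col)] + [list(row) + [corner]]

# Hypothetical loadings of n = 4 tests on r = 2 orthogonal common factors.
A = [[F(1, 2), F(1, 2)],
     [F(3, 5), F(1, 5)],
     [F(2, 5), F(3, 5)],
     [F(1, 2), F(1, 4)]]
n, r = len(A), len(A[0])
u2 = [1 - sum(a * a for a in row) for row in A]
R = [[sum(A[j][t] * A[k][t] for t in range(r)) + (u2[j] if j == k else 0)
      for k in range(n)] for j in range(n)]
Q = [[(1 if k == l else 0) + sum(A[j][k] * A[j][l] / u2[j] for j in range(n))
      for l in range(r)] for k in range(r)]

j, k = 0, 2                                   # regress test j; coefficient of test k
delta_jk = det(border(Q, A[j], A[k], 0))      # | Q b_j' ; b_k 0 |
P_j = det(border(Q, A[j], A[j], u2[j]))       # | Q b_j' ; b_j u_j^2 |
P_k = det(border(Q, A[k], A[k], u2[k]))

beta_direct = -cofactor(R, j, k) / cofactor(R, j, j)
beta_from_q = -(u2[j] / u2[k]) * delta_jk / P_j

partial_sq_direct = cofactor(R, j, k) ** 2 / (cofactor(R, j, j) * cofactor(R, k, k))
partial_sq_from_q = delta_jk ** 2 / (P_j * P_k)
print(beta_direct, beta_from_q)
print(partial_sq_direct, partial_sq_from_q)
```

The sign of the partial correlation is that of −delta_jk, since the two bordered determinants under the square root are positive.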


III. THE PREDICTION OF FACTOR SCORES 


8. Two Theorems. The derivations will be simplified by the use
of the following two convenient theorems. Let $a_m$ be the n×m matrix
composed of any m columns of A, and let $A_m$ be the n×(r−m) matrix
composed of the remaining columns. Let $S_m$ be the (n+m) × (n+m)
supermatrix constructed by “bordering” R with $a_m$ in the following
manner:

$$ S_m = \begin{Vmatrix} I & a_m' \\ a_m & R \end{Vmatrix} . $$

Premultiply the first row of $S_m$ by $a_m$ and subtract from the second
row. Then

$$ S_m = | R - a_m a_m' | = | A_m A_m' + U^2 | . $$

Similarly, if $u_m$ is composed of any m columns of U, and if we form

$$ T_m = \begin{Vmatrix} I & u_m' \\ u_m & R \end{Vmatrix} , $$

then

$$ T_m = | R - u_m u_m' | . $$

The m uniquenesses have been subtracted from the diagonal of R.

These results may be stated as 

Theorem 1. “Bordering” (in the above sense) the correlation 
determinant by components on any set of orthogonal coordinate axes 
is equivalent to subtracting from the correlations the contributions of 
these components. 

Since A is of rank r, it can be shown that AA′ is of rank r. Therefore
all minors of order r+1 in AA′ must vanish. This means that all
minors of order r+1 in R that do not include a diagonal element must
vanish. If we border R with r+1 sets of unique components, by theorem
(1) we are doing the equivalent of subtracting r+1 uniquenesses
from the diagonal of R. This leaves a swath of r+1 rows (and
columns) in R in which all (r+1)-order minors vanish. Therefore,
by Laplace’s development according to this swath, the “bordered” R
must vanish. Similarly, if R is bordered by r−m sets of common factor
components and m+1 sets of unique components, it will vanish.
This gives the 

Corollary. A correlation determinant vanishes if it is “bordered” 
by r+1 sets of components. 

We now desire the theorem for evaluating $S_m$. Let

$$ D_m = \begin{vmatrix} I & A_m' \\ -A_m & U^2 \end{vmatrix} . $$

Premultiply the first row by $A_m$ and add to the second. Then

$$ D_m = | A_m A_m' + U^2 | = S_m . $$

Postmultiply the second column by $V^2 A_m$ and add to the first. Then

$$ D_m = S_m = U^2 \, | I + A_m' V^2 A_m | . $$

Thus we have

Theorem 2. The (n+m)-order determinant $S_m$ may be evaluated
from

$$ U^2 \, | I + A_m' V^2 A_m | , $$

which is essentially a determinant of order r−m.

By noticing that

$$ A_m A_m' + U^2 = U^2 ( V^2 A_m A_m' + I ) , $$

we see that we have shown that

$$ | I + V^2 A_m A_m' | = | I + A_m' V^2 A_m | . $$

The left member is an nth order determinant; the right member is
a determinant of order r−m. The left member may also be written

$$ | I + A_m A_m' V^2 | . $$

The derivation in Part II of the formula for R is the special case
of theorem (2) where $A_m = A$.

9. The Regression of a Common Factor on the Remaining Com- 
mon Factors and the Tests. 

a.) The multiple correlation coefficient. The square of the multiple
correlation of the first common factor on the r−1 remaining
common factors and the n tests is (cf. §3)

$$ 1 - \frac{S}{S_{11}} , $$

where

$$
S = \begin{vmatrix} I & A' \\ A & R \end{vmatrix} ,
\qquad
S_{11} = \begin{vmatrix} I & A_1' \\ A_1 & R \end{vmatrix} ,
$$

$A_1$ being A with its first column omitted, in the notation of §8.
S is the matrix of the intercorrelations of the n+r variables of tests
and common factors. From theorem (2),

$$ S = U^2 , \qquad S_{11} = U^2 \, | 1 + a_1' V^2 a_1 | . $$

It will be recalled that by an $a_k$ is meant the kth column vector of A.
The value of $S_{11}$ is $U^2$ times the determinant of order one, or the plain
number,

$$ 1 + \sum_{j=1}^{n} \frac{a_{j1}^2}{u_j^2} . $$

Therefore the square of the multiple correlation coefficient is

$$ 1 - \frac{1}{1 + \sum_{j=1}^{n} \frac{a_{j1}^2}{u_j^2}} . $$





This formula looks familiar. It strongly resembles Spearman’s for- 
mula for the correlation between his “general” factor and the best 
weighted “pool” of tests (9). If there were but a single common fac- 
tor in R, that is, if A were of rank one, there would be no other com- 
mon factors for us to include in the regression with which we are 
working. The “independent” variables in the regression would be 
only tests. Also, it would follow that

$$ u_j^2 = 1 - a_{j1}^2 . $$


Therefore, our formula reduces to Spearman’s formula in the case of 
a single factor. If there is more than one common factor, our formu- 
la implies that the additional factors are included in the regression. 

b.) The regression coefficients of the tests. For the regression
coefficient of the first test, we need the cofactor of $a_{11}$ in S. Let us
write it as

$$
\begin{vmatrix}
0 & 0 & L_1 \\
0 & I & A_1' \\
a_1 & A_1 & R
\end{vmatrix} ,
$$

where $L_1$ is the row vector of n dimensions (1 0 0 ··· 0). Postmultiply
the first column by $a_1'$, the second by $A_1'$, and subtract from the third
column. Then the cofactor becomes

$$
\begin{vmatrix}
0 & 0 & L_1 \\
0 & I & 0 \\
a_1 & A_1 & U^2
\end{vmatrix}
= \begin{vmatrix} 0 & L_1 \\ a_1 & U^2 \end{vmatrix}
= - \frac{a_{11}}{u_1^2} U^2 .
$$

Let

$$ c_1^2 = 1 \Big/ \Big( 1 + \sum_{j=1}^{n} \frac{a_{j1}^2}{u_j^2} \Big) . $$

The desired regression coefficient is (cf. §3)

$$ c_1^2 \, \frac{a_{11}}{u_1^2} . $$

As before, when there is but a single common factor, this formula
reduces to Spearman’s regression coefficient. It may be noted
that this method of derivation gives the proportionality constant $c_1^2$
explicitly.

c.) The regression coefficients of the other common factors.
Since the regression with which we are dealing includes the other
r−1 common factors, their regression coefficients must also be found.
For the regression coefficient of the second factor we need the
determinant

$$
S_{12} = \begin{vmatrix}
0 & L & 0 \\
0 & I & A_1' \\
a_1 & A_1 & R
\end{vmatrix} ,
$$

where L is the row vector of r−1 dimensions (1 0 0 ··· 0).

Postmultiply the first column of $S_{12}$ by $a_1'$, the second column by $A_1'$,
and subtract from the third column. Then, on removing the unit block,

$$ S_{12} = \begin{vmatrix} 0 & -a_2' \\ a_1 & U^2 \end{vmatrix} . $$

In this last determinant, postmultiply the second column by $V^2 a_1$ and
subtract from the first column. Therefore,

$$
S_{12} = \begin{vmatrix} a_2' V^2 a_1 & -a_2' \\ 0 & U^2 \end{vmatrix}
= U^2 \sum_{j=1}^{n} \frac{a_{j1} a_{j2}}{u_j^2} .
$$

The regression coefficient of the second common factor is

$$ - \frac{S_{12}}{S_{11}} . $$

d.) Résumé of the coefficients. The multiple correlation and
regression coefficients for any factor or test other than the particular
ones we have used may be obtained simply by a change in subscript.
Therefore, we have established the following formulas:

(1) The square of the multiple correlation of the kth factor on
the remaining r−1 factors and the n tests is

$$ c_k^2 \sum_{j=1}^{n} \frac{a_{jk}^2}{u_j^2} = 1 - c_k^2 . $$

(2) The regression coefficient of the lth factor is

$$ - c_k^2 \sum_{j=1}^{n} \frac{a_{jk} a_{jl}}{u_j^2} . $$

(3) The regression coefficient of the jth test is, for unit variance,

$$ c_k^2 \, \frac{a_{jk}}{u_j^2} . $$

If it is desired to remove any tests from the regression, it is necessary
only to remove those tests from the summations over j in the
formulas.
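The three formulas of the résumé can be verified against the normal equations of the regression without inverting any matrix. The sketch below, on made-up loadings, forms the weights from the formulas and checks that they reproduce the correlations of every predictor with the factor being predicted. All names (`A`, `S`, `weights`, and so on) are illustrative assumptions.

```python
from fractions import Fraction as F

# Hypothetical loadings of n = 4 tests on r = 2 orthogonal common factors.
A = [[F(1, 2), F(1, 2)],
     [F(3, 5), F(1, 5)],
     [F(2, 5), F(3, 5)],
     [F(1, 2), F(1, 4)]]
n, r = len(A), len(A[0])
u2 = [1 - sum(a * a for a in row) for row in A]
R = [[sum(A[j][t] * A[k][t] for t in range(r)) + (u2[j] if j == k else 0)
      for k in range(n)] for j in range(n)]

k = 0                                         # predict the first common factor
c2 = 1 / (1 + sum(A[j][k] ** 2 / u2[j] for j in range(n)))        # c_k^2

# Formula (3): weights of the tests; formula (2): weights of the other factors.
w_tests = [c2 * A[j][k] / u2[j] for j in range(n)]
w_factors = {l: -c2 * sum(A[j][k] * A[j][l] / u2[j] for j in range(n))
             for l in range(r) if l != k}
r2 = 1 - c2                                   # formula (1)

# S = || I A' ; A R ||: correlations among the r factors and the n tests.
S = [[F(int(a == b)) for b in range(r)] + [A[j][a] for j in range(n)]
     for a in range(r)]
S += [list(A[j]) + R[j] for j in range(n)]

# Weights indexed like the rows of S (the entry for factor k itself is unused).
weights = [w_factors.get(l, F(0)) for l in range(r)] + w_tests
predictors = [q for q in range(n + r) if q != k]

# Normal equations: for each predictor p, sum_q S[p][q] w[q] = S[p][k].
residual = [sum(S[p][q] * weights[q] for q in predictors) - S[p][k]
            for p in predictors]
r2_check = sum(S[k][q] * weights[q] for q in predictors)
print(r2, r2_check, residual)
```

The squared multiple correlation is recovered here as the product sum of the weights and the zero-order correlations, the same device used in §10.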

10. The Regression of a Common Factor on the Tests Alone. 
The previous section gives the regression of a common factor on the 
remaining common factors and the tests. However, scores cannot 
actually be predicted from such a regression equation unless the exact 
scores on the remaining common factors are known. Since the scores 
of the other factors will in general have to be predicted from regres- 
sion equations, with errors of estimate, they will not be known ex- 
actly. Therefore, the above regression is not serviceable in practice, 
except in the case of the single common factor. Our present problem 
is to derive the regression of any factor on the tests alone, for test 
scores will be known quantities. 

The correlation matrix for the regression of the kth common factor
on the n tests is the (n+1)-order matrix

$$ S_k = \begin{Vmatrix} 1 & a_k' \\ a_k & R \end{Vmatrix} . $$

From theorem (2),

$$ S_k = U^2 \, | I + A_k' V^2 A_k | . $$

The determinant on the right is of order r−1.
The square of the multiple correlation coefficient is

$$ 1 - \frac{S_k}{R} = 1 - \frac{| I + A_k' V^2 A_k |}{Q} . $$

The regression coefficient of the jth test is $-\frac{1}{R}$ times the cofactor
of $a_{jk}$ in $S_k$. This cofactor may be obtained by substituting $a_k$
for the jth column of R (or $a_k'$ for the jth row of R), evaluating the
resulting determinant, and attaching a minus sign to the answer.
Expanding this determinant according to the elements of $a_k$,
and multiplying by $-\frac{1}{R}$ (the minus signs cancel), gives the regression
coefficient of the jth test:

$$ \frac{1}{R} \left( a_{1k} R_{1j} + a_{2k} R_{2j} + \cdots + a_{nk} R_{nj} \right) . $$

This is the formula, given in a different fashion, of Thurstone and
Thomson* (10; 12, p. 228).

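From Part II of this paper, we know how to evaluate R and $R_{jk}$
from Q and determinants of order r+1. Writing the above expression
in terms of these determinants and collecting terms, we find that
each regression coefficient collapses into the simple ratio of essentially
two rth-order determinants. For example, the regression coefficient
of the jth test for predicting the first common factor is

$$
\frac{1}{u_j^2 \, Q}
\begin{vmatrix}
a_{j1} & \sum_{i=1}^{n} \frac{a_{i1}a_{i2}}{u_i^2} & \sum_{i=1}^{n} \frac{a_{i1}a_{i3}}{u_i^2} & \cdots & \sum_{i=1}^{n} \frac{a_{i1}a_{ir}}{u_i^2} \\
a_{j2} & 1 + \sum_{i=1}^{n} \frac{a_{i2}^2}{u_i^2} & \sum_{i=1}^{n} \frac{a_{i2}a_{i3}}{u_i^2} & \cdots & \sum_{i=1}^{n} \frac{a_{i2}a_{ir}}{u_i^2} \\
\vdots & \vdots & \vdots & & \vdots \\
a_{jr} & \sum_{i=1}^{n} \frac{a_{ir}a_{i2}}{u_i^2} & \sum_{i=1}^{n} \frac{a_{ir}a_{i3}}{u_i^2} & \cdots & 1 + \sum_{i=1}^{n} \frac{a_{ir}^2}{u_i^2}
\end{vmatrix} ,
$$

the sums running over all the tests. The determinant is Q with its first
column replaced by the loadings of test j.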

From these regression coefficients, we may obtain a new evalua- 
tion of the multiple correlation coefficient. It is well known that the 
square of the multiple correlation coefficient is equal to the product 
sum of the regression coefficients and the corresponding zero-order 
correlation coefficients. Since the factor components are the correla- 
tions between the factors and tests, we may obtain the multiple cor- 
relation coefficient by such a process. Thus, the square of the multiple 
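correlation of the first common factor on the tests alone is

$$
\frac{1}{Q}
\begin{vmatrix}
\sum_{j=1}^{n} \frac{a_{j1}^2}{u_j^2} & \sum_{j=1}^{n} \frac{a_{j1}a_{j2}}{u_j^2} & \cdots & \sum_{j=1}^{n} \frac{a_{j1}a_{jr}}{u_j^2} \\
\sum_{j=1}^{n} \frac{a_{j2}a_{j1}}{u_j^2} & 1 + \sum_{j=1}^{n} \frac{a_{j2}^2}{u_j^2} & \cdots & \sum_{j=1}^{n} \frac{a_{j2}a_{jr}}{u_j^2} \\
\vdots & \vdots & & \vdots \\
\sum_{j=1}^{n} \frac{a_{jr}a_{j1}}{u_j^2} & \sum_{j=1}^{n} \frac{a_{jr}a_{j2}}{u_j^2} & \cdots & 1 + \sum_{j=1}^{n} \frac{a_{jr}^2}{u_j^2}
\end{vmatrix} .
$$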

The square of the multiple correlation of the kth common factor on
the tests alone is similarly obtained by omitting unity from the kth
diagonal element of Q and dividing the resulting determinant by Q.

* Holzinger (4) does not state the formulas but applies the Doolittle method
to $S_k$. This is equivalent to applying the formulas.

The formulas of this section reduce to Spearman’s formulas in the
case of a single common factor. If it is desired to remove any tests
from the regressions, it is necessary only to omit them from the
summations over the tests.
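The collapse of each regression weight into a ratio of two rth-order determinants can be illustrated numerically. In the sketch below the weights for predicting the first common factor from the tests alone are obtained by replacing a column of Q with the loadings of the test in question, and are then checked against the normal equations Rw = a₁. The data and the helper names are hypothetical.

```python
from fractions import Fraction as F

def det(m):
    """Determinant by exact Gaussian elimination (entries become Fractions)."""
    m = [[F(x) for x in row] for row in m]
    d = F(1)
    for c in range(len(m)):
        p = next((i for i in range(c, len(m)) if m[i][c] != 0), None)
        if p is None:
            return F(0)
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for i in range(c + 1, len(m)):
            f = m[i][c] / m[c][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[c])]
    return d

def replace_column(M, c, col):
    """Copy of M with column c replaced by the entries of col."""
    return [row[:c] + [v] + row[c + 1:] for row, v in zip(M, col)]

# Hypothetical loadings of n = 4 tests on r = 2 orthogonal common factors.
A = [[F(1, 2), F(1, 2)],
     [F(3, 5), F(1, 5)],
     [F(2, 5), F(3, 5)],
     [F(1, 2), F(1, 4)]]
n, r = len(A), len(A[0])
u2 = [1 - sum(a * a for a in row) for row in A]
R = [[sum(A[j][t] * A[k][t] for t in range(r)) + (u2[j] if j == k else 0)
      for k in range(n)] for j in range(n)]
Q = [[(1 if k == l else 0) + sum(A[j][k] * A[j][l] / u2[j] for j in range(n))
      for l in range(r)] for k in range(r)]

k = 0                                          # the first common factor
det_Q = det(Q)

# Weight of test j: (1 / (u_j^2 Q)) * |Q with column k replaced by b_j'|.
w = [det(replace_column(Q, k, A[j])) / (u2[j] * det_Q) for j in range(n)]

# Least-squares weights must satisfy the normal equations R w = a_k.
residual = [sum(R[i][j] * w[j] for j in range(n)) - A[i][k] for i in range(n)]

# Squared multiple correlation: product sum, and Q with unity omitted from
# its kth diagonal element, divided by Q.
r2_from_w = sum(w[j] * A[j][k] for j in range(n))
Q_minus = [row[:] for row in Q]
Q_minus[k][k] -= 1
r2_from_q = det(Q_minus) / det_Q
print(w, r2_from_w, r2_from_q)
```

No determinant of order higher than r enters the weights, which is the economy the section describes.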

11. The Regression of a Unique Factor on the Tests Alone. Com- 
ponent analysts are ordinarily concerned with the common factors, 
but it may in some cases be of interest to predict scores on the unique 
factors. Godfrey Thomson has given the necessary regression equations
(10). They will be derived here without going back to the least
squares process, and then expressed in terms of Q and determinants
of order r+1.

For the regression of the first unique factor on the n tests the
correlation matrix is

$$ T_1 = \begin{Vmatrix} 1 & u_1' \\ u_1 & R \end{Vmatrix} , $$

in which the bordering vector $u_1$ is the first column of U.
From theorem (1), $T_1$ is obtained from R by subtracting $u_1^2$ from the
first diagonal element. Therefore,

$$ T_1 = R - u_1^2 R_{11} . $$

The square of the multiple correlation coefficient is

$$ 1 - \frac{T_1}{R} = 1 - \frac{R - u_1^2 R_{11}}{R} = u_1^2 \frac{R_{11}}{R} . $$

The regression coefficient of the jth test is seen to be

$$ u_1 \frac{R_{1j}}{R} . $$

(The square of the multiple correlation coefficient is simply $u_1$ times
the regression coefficient of the first test because the first unique factor
correlates zero with all tests but the first.)

In general, the square of the multiple correlation of the kth
unique factor on the n tests is

$$ u_k^2 \frac{R_{kk}}{R} . $$

The regression coefficient of the jth test for predicting the kth unique
factor is

$$ u_k \frac{R_{kj}}{R} . $$

From Part II, we know how to evaluate these expressions from
Q and determinants of order r+1. Thus, the square of the multiple
correlation coefficient of the kth unique factor on the n tests is

$$ \frac{Q_k}{Q} $$

or

$$ \frac{1}{u_k^2 \, Q} \begin{vmatrix} Q & b_k' \\ b_k & u_k^2 \end{vmatrix} . $$

The regression coefficient of the jth test (j ≠ k) is

$$ \frac{1}{u_j^2 \, u_k \, Q} \begin{vmatrix} Q & b_j' \\ b_k & 0 \end{vmatrix} . $$

If it is desired to omit any tests from the regression, it is necessary
only to omit those tests from the summations in Q.

12. Compact Derivation of the Regressions on Tests of Tests
and Factors. If $w_{jk}$ is the regression coefficient of test j in the
regression of the common factor k on the n tests, and if

$$ W = \| w_{jk} \| , $$

then the results of §10 may be written in the compact form:

$$ W' = A' R^{-1} . $$

Now

$$ R = AA' + U^2 . $$

Premultiply both sides by $A'V^2$:

$$ A'V^2 R = A'V^2 AA' + A' = (A'V^2 A + I) A' = QA' . $$

Therefore,

$$ A'V^2 = Q A' R^{-1} $$

and

$$ Q^{-1} A' V^2 = A' R^{-1} . \tag{1} $$


This last equation gives the regression coefficients from rth-order de- 
terminants.* 

* Equation (1) appeared in (6) in different notation shortly after the first
draft of the above was written, and subsequently in (7).

For the regression coefficients in the regression of a test on the
remaining tests we need the inverse of R. Premultiply both sides of
(1) by A:

$$ A Q^{-1} A' V^2 = AA' R^{-1} . $$

Add $U^2 R^{-1}$ to both sides:

$$ A Q^{-1} A' V^2 + U^2 R^{-1} = AA' R^{-1} + U^2 R^{-1} = (AA' + U^2) R^{-1} = R R^{-1} = I . $$

Therefore,

$$ R^{-1} = V^2 ( I - A Q^{-1} A' V^2 ) . \tag{2} $$

The coefficients of §3 are readily obtained from the right-hand member.

The regression weights of tests for the prediction of unique factor
scores were shown in §11 to be obtained from

$$ U R^{-1} . $$

Premultiplying both sides of (2) by U therefore gives the regression
coefficients of the tests for predicting unique factor scores:

$$ U R^{-1} = V ( I - A Q^{-1} A' V^2 ) . \tag{3} $$
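Equations (1) and (2) lend themselves to a direct numerical check: only the rth-order matrix Q is inverted, and the results are confirmed by multiplying back against R. The following is a minimal sketch on made-up loadings; the helpers `inverse` and `matmul` and all variable names are assumptions introduced for the illustration.

```python
from fractions import Fraction as F

def inverse(m):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    size = len(m)
    aug = [[F(x) for x in row] + [F(int(i == j)) for j in range(size)]
           for i, row in enumerate(m)]
    for c in range(size):
        p = next(i for i in range(c, size) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]      # scale pivot row to 1
        for i in range(size):
            if i != c:
                f = aug[i][c]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[c])]
    return [row[size:] for row in aug]

def matmul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Hypothetical loadings of n = 4 tests on r = 2 orthogonal common factors.
A = [[F(1, 2), F(1, 2)],
     [F(3, 5), F(1, 5)],
     [F(2, 5), F(3, 5)],
     [F(1, 2), F(1, 4)]]
n, r = len(A), len(A[0])
u2 = [1 - sum(a * a for a in row) for row in A]
R = [[sum(A[j][t] * A[k][t] for t in range(r)) + (u2[j] if j == k else 0)
      for k in range(n)] for j in range(n)]
Q = [[(1 if k == l else 0) + sum(A[j][k] * A[j][l] / u2[j] for j in range(n))
      for l in range(r)] for k in range(r)]

Q_inv = inverse(Q)                       # the only inversion, of order r

# Equation (1): W' = Q^{-1} A' V^2, an r x n array of regression weights.
Wt = [[sum(Q_inv[k][l] * A[j][l] for l in range(r)) / u2[j] for j in range(n)]
      for k in range(r)]

# Equation (2): R^{-1} = V^2 (I - A Q^{-1} A' V^2) = V^2 (I - A W').
R_inv = [[((1 if i == j else 0) - sum(A[i][k] * Wt[k][j] for k in range(r))) / u2[i]
          for j in range(n)] for i in range(n)]

identity = [[F(int(i == j)) for j in range(n)] for i in range(n)]
At = [[A[j][k] for j in range(n)] for k in range(r)]      # A'
print(matmul(R, R_inv) == identity, matmul(Wt, R) == At)
```

Readers may recognize (2) as an instance of what is now commonly called the Woodbury identity; the paper derives it independently from the factor structure.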


IV. LIMITS OF CORRELATION 


13. The Communality as an Upper Bound for Multiple Correlation.
The communality of a test is defined by Thurstone as the sum
of the squares of its common factor components. The communality
of test j is denoted by $h_j^2$. The total variance of test j, taken as
unity, is the sum of the communality and the uniqueness:

$$ h_j^2 + u_j^2 = 1 . $$

The communality thus gives the proportion of the total variance due 
to the common factors. The central role played by the communality 
in multiple correlation theory has already been indicated by Merrill 
Roff (8), and subsequently by Dwyer (1). 

From §7, the square of the multiple correlation of the jth test on
the n−1 remaining tests is

$$ 1 - u_j^2 \frac{Q}{Q_j} . $$

Q and $Q_j$ are non-singular since the hypothesis that U is non-singular
makes R and $R_{jj}$ non-singular. We need to show that Q > $Q_j$.

Now,

$$ Q_j = \begin{vmatrix} Q & \frac{1}{u_j} b_j' \\ \frac{1}{u_j} b_j & 1 \end{vmatrix} . $$

Postmultiply the first column by $-Q^{-1} \frac{1}{u_j} b_j'$ and add to the second. Then

$$ Q_j = Q ( 1 - \theta ) , $$

where

$$ \theta = \frac{b_j}{u_j} \, Q^{-1} \, \frac{b_j'}{u_j} . $$

Since Q is Gramian and non-singular, there exists a real, non-singular
matrix M of order r such that

$$ Q = MM' . $$

For such an M,

$$ Q^{-1} = (M^{-1})' M^{-1} . $$

Therefore θ is a Gramian matrix because it is the product of the matrix
$\frac{b_j}{u_j} (M^{-1})'$ and the matrix transposed. Since $M^{-1}$ is non-singular
the rank of θ is that of $b_j$. Excluding the trivial case where $b_j = 0$,
$b_j$ is a row vector of rank one; and θ is therefore of rank one. From
the multiplication involved in its formation, θ is of order one. Hence
θ is a non-singular Gramian matrix of order one, or θ is simply a positive
scalar number. Since $Q_j > 0$, we have

$$ 0 < \theta < 1 . $$

Therefore, $Q_j$ is something less than unity times Q, or

$$ \frac{Q}{Q_j} > 1 , $$

and the square of the multiple correlation coefficient of test j is less
than the communality.

In the trivial case where $b_j = 0$, test j correlates zero with all
the remaining tests.

14. The Communality as a Limit for Multiple Correlation. A
problem which Merrill Roff indicated but did not solve in the general
case was to prove that, if r remained constant, the square of the multiple
correlation coefficient approached the communality as a limit as
n tended to infinity. The conditions under which this is true may be
shown from the machinery of the previous section. The problem is to
find the conditions under which there exists

$$ \lim_{n \to \infty} \frac{Q}{Q_j} = 1 . $$

From §13, it is clear that a necessary and sufficient condition is that
there exist

$$ \lim_{n \to \infty} \theta = 0 . $$


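Let C be the inverse of the matrix of the diagonal elements of Q:

$$
C = \begin{Vmatrix}
c_1^2 & & & \\
 & c_2^2 & & \\
 & & \ddots & \\
 & & & c_r^2
\end{Vmatrix} .
$$

The $c_k$ have the same meaning as in §9b. Let $\beta_j = \frac{b_j}{u_j} C^{1/2}$, and let
$\bar{Q} = C^{1/2} Q C^{1/2}$. Then

$$ \theta = \beta_j \bar{Q}^{-1} \beta_j' . $$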


Now each element in the principal diagonal of $\bar{Q} = C^{1/2} Q C^{1/2}$ is equal to unity.
Therefore, the determinant $\bar{Q}$ is not greater than unity, because the
determinant of a Gramian matrix cannot exceed the product of the
elements in its principal diagonal. Furthermore, the absolute value of
each element in $\bar{Q}$ has unity as an upper bound, because the determinants
of all second-order principal minors in $\bar{Q}$ are positive for all n
sufficiently large. Hence, $\bar{Q}^{-1} \ge 1$ for all n sufficiently large, or $\bar{Q}^{-1}$ is
bounded from becoming singular. If there exists a δ > 0 such that
$\bar{Q} \ge \delta$ for all n sufficiently large, then $\bar{Q}^{-1} \le \frac{1}{\delta}$ for all n sufficiently
large, and the absolute value of each element in $\bar{Q}^{-1}$ has an upper
bound.
It is then seen that if there exists a δ > 0 such that

$$ | C^{1/2} Q C^{1/2} | \ge \delta $$

for all n sufficiently large, a necessary and sufficient condition for
there to exist the limit zero for θ is that

$$ \lim_{n \to \infty} \beta_j = 0 . $$

That is, if there exists such a δ > 0, then a necessary and sufficient
condition for the square of the multiple correlation of test j to approach
the communality of test j as a limit as n tends to infinity is
that

$$ \lim_{n \to \infty} b_j C = 0 . $$

Since $b_j$ is constant with respect to n, the condition is that all $c_k$
which correspond to non-vanishing factor loadings of test j must vanish
in the limit. This means that the tests that are added to the regression
as n increases must continue to have appreciable components on
the common factors present in test j.
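The condition can be seen at work in the simplest case of a single common factor, where Q reduces to the plain number 1 + Σ a²/u². The sketch below compares two hypothetical batteries: one in which every added test keeps the same loading, so that Q grows without bound and c₁² = 1/Q vanishes, and one in which the loadings of the added tests fade geometrically, so that Q stays bounded. The function name and the loadings are illustrative assumptions, and floating-point arithmetic is used.

```python
def smc_single_factor(loadings, j):
    """Squared multiple correlation of test j on the other tests when there
    is one common factor, from 1 - u_j^2 Q / Q_j with Q = 1 + sum a^2 / u^2."""
    ratios = [a * a / (1 - a * a) for a in loadings]   # a_i^2 / u_i^2
    Q = 1 + sum(ratios)
    Q_j = Q - ratios[j]                                # test j omitted from the sum
    return 1 - (1 - loadings[j] ** 2) * Q / Q_j

h2 = 0.6 ** 2            # communality of the test being predicted (index 0)

for n in (5, 50, 500, 5000):
    steady = [0.6] * n                                 # added tests stay loaded
    fading = [0.6 * 0.5 ** i for i in range(n)]        # added tests lose the factor
    print(n,
          h2 - smc_single_factor(steady, 0),
          h2 - smc_single_factor(fading, 0))

gap_steady = h2 - smc_single_factor([0.6] * 5000, 0)
gap_fading = h2 - smc_single_factor([0.6 * 0.5 ** i for i in range(5000)], 0)
```

In the first battery the gap between the communality and the squared multiple correlation shrinks roughly in proportion to 1/n; in the second it settles at a positive value, because the added tests cease to carry appreciable components on the factor.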

15. A Limit for Partial Correlation and Regression Coefficients.
The sufficiency of the condition in §14 could have been shown by reference
to the second formula for the multiple correlation coefficient in
§7. The Q could be premultiplied and postmultiplied by $C^{1/2}$, and the
determinant whose inverse is indicated could be premultiplied and
postmultiplied by the determinant of

$$ \begin{Vmatrix} C^{1/2} & 0 \\ 0 & 1 \end{Vmatrix} $$

without changing the value of the coefficient. Then if $C^{1/2} Q C^{1/2}$ is bounded
from becoming singular, the sufficiency is immediately apparent.

A similar treatment of the formulas in §7 for partial correlation
and regression coefficients, with the proper use of C, shows that if
there exists a δ > 0 such that

$$ | C^{1/2} Q C^{1/2} | \ge \delta $$

for all n sufficiently large, a sufficient condition for the coefficients to
tend to zero is that either $b_j C$ or $b_k C$ tend to zero.

It is interesting to note that if the multiple correlation of test j
tends to its communality as n becomes infinite, and if

\[ | C^{1/2} Q C^{1/2} | \ge \delta \]

for all n sufficiently large, then all the regression coefficients in this
regression must tend to zero. (Also, all partial correlation coefficients
of test j with the other tests, the regressions being of an order which
tends to infinity with n, must tend to zero.) Since the square of a
multiple correlation coefficient is the product sum of the regression
coefficients of the “independent” tests and the correlations of these
tests with the “dependent” test, and since these latter correlations are
constant with respect to n, it follows that the communality is the
limit of the sum of n-1 terms, each of which has the limit zero, as n
becomes infinite.

16. Limits for the Prediction of Factor Scores. The following 
results are either found immediately from §9d, §10, and §11, or can 
be established in a manner similar to that of §14 and §15, so they will 
be stated without proof. 

As n becomes infinite, the condition that $c_k$ tend to zero is

(1) necessary and sufficient for unity to be the limit of the
multiple correlation of the kth common factor on the r-1 remaining
common factors and the n tests;

(2) sufficient for zero to be the limit of any regression coefficient
in the regression of (1).

If there exists a $\delta > 0$ such that

\[ | C^{1/2} Q C^{1/2} | \ge \delta \]

for all n sufficiently large, then the condition that $c_k$ tend to zero is

(3) necessary and sufficient for unity to be the limit of the
multiple correlation of the kth common factor on the tests alone;

(4) sufficient for zero to be the limit of any regression coefficient
in the regression of (3).

If there exists a $\delta > 0$ such that

\[ | C^{1/2} Q C^{1/2} | \ge \delta \]

for all n sufficiently large, then the condition that the limit of $b_j C$ be
zero is

(5) necessary and sufficient for unity to be the limit of the
multiple correlation of any unique factor on the n tests alone;

(6) sufficient for zero to be the limit of any regression coefficient
in the regression of (5).

17. Perfect Prediction. It is clear that the road to perfect prediction
of test scores from test scores must lie in a singular U. Let
us see what happens to R when U is singular. Suppose only one test,
say test 1, has no uniqueness. Its communality is unity. Form the
supermatrix

\[
D = \begin{vmatrix}
 I    & b_1' & B_2'  \\
 -b_1 & 0    & 0     \\
 -B_2 & 0    & U_2^2
\end{vmatrix} ,
\]

where the submatrices have the same meaning as in §5. Premultiply
the first row by $b_1$ and add to the second row; premultiply the first
row by $B_2$ and add to the third. Then

\[
D = \begin{vmatrix}
 I & b_1'     & B_2'             \\
 0 & b_1 b_1' & b_1 B_2'         \\
 0 & B_2 b_1' & B_2 B_2' + U_2^2
\end{vmatrix} ,
\]

or

\[ D = R . \]

Postmultiply the third column of D by $U_2^{-2} B_2$ and add to the first
column. Then

\[
D = \begin{vmatrix}
 I + B_2' U_2^{-2} B_2 & b_1' & B_2'  \\
 -b_1                  & 0    & 0     \\
 0                     & 0    & U_2^2
\end{vmatrix} ,
\]

or

\[
D = | U_2^2 | \cdot \begin{vmatrix} Q_2 & b_1' \\ -b_1 & 0 \end{vmatrix} .
\]

It is clear that again

\[ R_{11} = | U_2^2 | \cdot | Q_2 | . \]

Hence, the square of the multiple correlation of the first test on the
n-1 remaining tests is

\[
1 - \frac{1}{| Q_2 |} \begin{vmatrix} Q_2 & b_1' \\ -b_1 & 0 \end{vmatrix} .
\]

By simple manipulations, this becomes

\[ 2 - \frac{| Q^* |}{| Q_2 |} , \]

where $Q^* = Q_2 + b_1' b_1$.
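
The closed form just obtained can be verified on a small numerical case. The sketch below uses hypothetical loadings for five tests on two common factors, with test 1 given a communality of exactly unity, and it assumes $Q_2 = I + B_2' U_2^{-2} B_2$ (the upper-left block produced by the column operation above; the definition in §5 is not reproduced here). It compares the determinantal definition 1 - |R|/R_11 with 2 - |Q*|/|Q_2|.

```python
import numpy as np

# Hypothetical loadings: test 1 has no uniqueness (0.6^2 + 0.8^2 = 1).
b1 = np.array([0.6, 0.8])
B2 = np.array([[0.5, 0.3],
               [0.2, 0.7],
               [0.6, 0.1],
               [0.4, 0.4]])
u2 = 1.0 - np.sum(B2 ** 2, axis=1)          # uniquenesses of the remaining tests

B = np.vstack([b1, B2])
R = B @ B.T + np.diag(np.concatenate([[0.0], u2]))   # correlation matrix, unit diagonal

# Determinantal definition: 1 - |R| / R_11, where R_11 is the minor omitting test 1.
smc_det = 1.0 - np.linalg.det(R) / np.linalg.det(R[1:, 1:])

# Form in terms of the components (Q_2 as assumed above).
Q2 = np.eye(2) + (B2.T / u2) @ B2
Q_star = Q2 + np.outer(b1, b1)
smc_q = 2.0 - np.linalg.det(Q_star) / np.linalg.det(Q2)

# Equivalent quadratic form: 1 - b_1 Q_2^{-1} b_1'.
smc_direct = 1.0 - b1 @ np.linalg.solve(Q2, b1)

print(smc_det, smc_q, smc_direct)
```

The three numbers agree to rounding error, and the common value lies below unity because only one uniqueness vanishes in this battery.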

Let $C_1$ be the same as C in §14 except that the first test is omitted
from the summations. As n increases, let us assume that all additional
tests have non-vanishing uniquenesses. Then by analogy to
§14, if there exists a $\delta > 0$ such that

\[ | C_1^{1/2} Q_2 C_1^{1/2} | \ge \delta \]

for all n sufficiently large, a necessary and sufficient condition for the
multiple correlation coefficient to tend to unity is that $b_1 C_1$ tend to
zero. Since the communality is unity, the communality is again the
limit of the square of the multiple correlation coefficient.


If more than one unique component vanishes in U, by an extension
of the above method, similar results are obtained. Let m be the
number of vanishing uniquenesses, where m < r + 1; let $C_m$ denote C
with the m corresponding tests omitted from the summations; and
suppose that all tests to be added have non-vanishing uniquenesses.
If there exists a $\delta > 0$ such that

\[ | C_m^{1/2} Q_m C_m^{1/2} | \ge \delta \]

for all n sufficiently large, then the condition that $b_j C_m$ tend to zero
is necessary and sufficient for the square of the multiple correlation of
test j to approach its communality as a limit.

The case where m = r + 1 is equivalent to the conditions of the
corollary to theorem (1), and R vanishes. Therefore, a necessary and
sufficient condition for perfect prediction of test scores to be possible
from a finite number of tests is that r + 1 or more of the uniquenesses
vanish.

18. Improving Multiple Correlation. It is found in practice that 
all tests tend to have unique factors, due to errors of measurement 
and to specific factors. Therefore, in practice, communalities will be 
less than unity; and hence multiple correlation coefficients will be less 
than unity. The problem is, what type of tests will help raise the mul- 
tiple correlation coefficient most if added to the regression? The 
analysis presented here enables an answer which conforms to previ- 
ous experience, but gives the answer from a point of greater vantage. 

If the rank of A remains constant as more tests are added, the 
communalities of the original n tests remain constant. Hence, if the 
squares of the multiple correlation coefficients are to attain their maximum
for these n tests, it is usually necessary to add tests which have
substantial loadings on almost all the common factors (from §15). 
In order to keep these additional tests hitting on almost all r cylin- 
ders, so to speak, it is in general necessary for them to have substan- 
tial correlations with all of the n original tests. This helps explain 
the known fact that adding to a regression tests that have substantial 
correlations with the tests already in the regression will not raise a 
multiple correlation coefficient much. The number of common factors 
in such a case tends to remain constant, so that the “dependent” test’s 
communality tends to remain constant, keeping the multiple correla- 
tion coefficient from increasing much. 

If, however, tests are added which correlate substantially with 
the “dependent” test but very slightly with the other tests in the re- 
gression, these new tests tend to bring new common factors in, thus 
increasing the rank of A and increasing the communality of the
“dependent” test. This raises the upper bound for the multiple correlation
coefficient and enables it to increase more readily. Therefore, if
it is desired to obtain better estimates of a given test, those tests 
which correlate substantially with that test but correlate negligibly 
amongst themselves are the most desirable as “independent” tests. 
(The predictability of the one test is enhanced at the expense of the 
predictability of the other tests in this case, for the other tests chosen 
by this criterion would have relatively small communalities.) 
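
The contrast drawn in this section can be illustrated with the familiar two-predictor formula for the squared multiple correlation, R² = (r_y1² + r_y2² - 2 r_y1 r_y2 r_12) / (1 - r_12²). The sketch below is not taken from the paper; the correlations of .5 with the “dependent” test and the intercorrelations of .0 and .8 between the “independent” tests are hypothetical values chosen only to show the direction of the effect.

```python
def squared_multiple_r(r_y1, r_y2, r_12):
    """Squared multiple correlation of a test on two predictors (standard formula)."""
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)

# Both predictors correlate .5 with the dependent test (hypothetical values).
uncorrelated = squared_multiple_r(0.5, 0.5, 0.0)   # predictors negligibly related
correlated = squared_multiple_r(0.5, 0.5, 0.8)     # predictors substantially related

print(round(uncorrelated, 4), round(correlated, 4))   # 0.5 0.2778
```

With the same correlations against the dependent test, the uncorrelated pair of predictors yields nearly twice the squared multiple correlation of the substantially correlated pair.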


A MODIFICATION OF THE METHOD OF
SUCCESSIVE INTERVALS 


CHARLES I. MOSIER 
BOARD OF EXAMINERS, UNIVERSITY OF FLORIDA 


A modification of the method of successive intervals is presented 
which yields scale values correlating .995 with those from Thur- 
stone’s method described by Saffir. Values yielded by the present 
method can be obtained in 25 per cent of the time required by the
older method and are shown to be, on a priori grounds, more reliable
as well.


Thurstone, in his method of successive intervals, has provided a 
psychophysical scaling method which Saffir (2) has presented and 
shown to be the equivalent of the Law of Comparative Judgment. 
While it is computationally simpler than most scaling methods, the 
labor of applying it to any considerable number of stimuli is still 
great. This paper proposes two modifications of the method which 
reduce the labor tremendously with little, if any, loss of accuracy. 

The method of successive intervals is applicable to data judged by 
the method of equal-appearing intervals or other similar methods 
where each stimulus is sorted into one of a number of categories, each 
greater than the last in some definite attribute. 

In its essence the method involves so spacing the categories of 
judgment that the responses to one stimulus will yield a Gaussian dis- 
tribution, and verifying that the responses to the other stimuli will 
also yield Gaussian distributions on the same spacing of the cate- 
gories. Saffir (2) describes the computations for this procedure in 
detail. The method described by Saffir will be referred to as the “long” 
method. If double probability paper be used or constructed, the per- 
centile ranks of the two stimuli may be plotted directly, the values 
read graphically and then converted to the desired origin and unit. 

There are certain disadvantages of the long method which are 
overcome in the modification to be described. One is that each stim- 
ulus is scaled against only one other stimulus with the resulting pos- 
sibility of inaccuracy and distortion, particularly if the hypothesis be 
true for certain stimuli in the series, but not for all. Another is that 
each scale value and $\sigma$ as obtained is measured from a different origin
and in a different unit and must be converted arithmetically to the desired
common origin and unit. Still a third disadvantage is that even
for two successive stimuli the overlapping will not be complete, so
that, while for each stimulus, x-values for five categories may be de- 
termined, only three of these categories will be common to the two 
stimuli and thus certain of the data must be discarded. The method 
to be described overcomes these disadvantages, and, where many stim- 
uli are to be scaled, effects a great saving in computational labor. 

In summary form, the method proposed consists in selecting a 
group of stimuli whose medians are distributed throughout the entire 
range to form a “composite criterion” group. These are scaled by the 
long method of Successive Intervals. From the results of this scaling 
the scale values of each category of judgment, referred to the mean
and $\sigma$ of the standard stimulus, can be computed. These average scale
values for each category then provide a scale by means of which each
other stimulus can be scaled directly in terms of the constants of
the standard stimulus.


                          TABLE 1
           Scale Values of Standard Set of Words,
               Method of Successive Intervals

  No.   Stimulus                      Md.     S.V.     σ
  281   Completely unsatisfactory     1.5     0.00    1.00
    4   Very unsatisfactory           2.3     0.75    0.65
   12   Catastrophic                  2.5     0.91    0.81
   27   Treacherous                   2.4     1.05    0.62
    1   Menacing                      2.9     1.14    0.56
   46   Discouraging                  3.5     1.42    0.49
   49   Painful                       3.6     1.43    0.54
   36   Unprofitable                  4.3     1.72    0.62
   30   Rejected                      4.6     1.79    0.54
   23   Disputable                    5.7     2.42    0.69
   48   Normal                        6.7     2.47    1.43
  139   Satiating                     5.2     2.79    1.54
   78   Reconcilable                  6.3     3.30    0.75
   20   Blameless                     7.6     3.64    0.90
   33   Solacing                      8.0     3.79    0.51
   40   Ordinary                      6.5     3.83    1.43
   21   Bonny                         8.4     3.97    0.61
   29   Decent                        8.5     4.08    0.61
    3   Preferable                    9.0     4.30    0.55
   18   Profitable                    9.4     4.40    0.47
    2   Popular                       9.7     4.55    0.49
   35   Successful                   10.0     4.65    0.54
   34   Sublime                      10.3     4.90    0.86
    8   Superior                     10.4     4.98    0.54
   17   Completely agreeable         10.1     4.95    0.65
    7   Superb                       7251     5.35    0.68

In the experiment which led to the development of the method
(1), three-hundred adjectives had been rated by 150 judges on an 11- 
point scale of favorableness-unfavorableness of judgment with the in- 
structions for equal-appearing intervals. Then 26 words, distributed 
throughout the entire range of stimuli, were selected. These formed 
the composite group, and these words were scaled by following Thurstone’s
method exactly, using the stimulus “completely unsatisfactory”
as the standard, with its mean at zero and its $\sigma$ equal to unity.
The scale values of the composite group are shown in Table 1.

                          TABLE 2
     Scale Values of Upper Limits of Successive Categories,
  Method of Successive Intervals: “Completely Unsatisfactory”
                     Standard Stimulus

              Mean     S.D. of    S.E. of    Size of
  Interval    S.V.      S.V.       S.V.      Category
      1       0.64      .09        .037        --
      2       1.20      .05        .015       .56
      3       1.60      .09        .026       .40
      4       1.95      .03        .009       .42
      5       2.64      .03        .013       .62
      6       3.36      .07        .025       .72
      7       3.85      .08        .026       .49
      8       4.21      .07        .019       .36
      9       4.66      .06        .016       .45
     10       5.26      .01        .004       .60
     11        --        --         --

The
scaled distance of each category in terms of the standard stimulus
could then be obtained from each stimulus whose distribution overlapped
that category. If x is the distance of a point on the continuum
(a category limit) from the mean of any stimulus s in units of $\sigma_s$, then
y, the distance of that category limit from the standard stimulus, u,
in units of $\sigma_u$, is given by*

\[ y = \frac{M_s - M_u}{\sigma_u} + x \, \frac{\sigma_s}{\sigma_u} , \]

or, since $M_u = 0$, $\sigma_u = 1$,

\[ y = M_s + x \sigma_s . \]
* From Saffir’s equations 6 and 7 (2).

The scale values of each category, 1, 2, 3, ..., 10, in terms of the
standard stimulus were computed for each stimulus in the composite
criterion whose distribution overlapped that category within the limits
of reliability (percentile values between 5 and 95). From 6 to 14
values were found for the scale value of each of these category limits.
The mean scale value for each category was then computed, together
with its $\sigma$. These values are given in Table 2.
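
The averaging step just described can be sketched in a few lines. This is an illustration under stated assumptions, not the original computing routine: the function name and data layout are hypothetical, the three (M_s, σ_s) pairs are borrowed from Table 1, and the cumulative proportions are synthetic, generated from a category limit placed at 1.20 so that the recovered value can be checked.

```python
from statistics import NormalDist, mean

def category_limit_estimates(stimuli, lo=0.05, hi=0.95):
    """Estimate one category limit from several already-scaled stimuli.

    stimuli: iterable of (M_s, sigma_s, p), where p is the cumulative
    proportion of judgments of stimulus s falling at or below the limit.
    Only proportions within the limits of reliability (lo..hi) are used.
    """
    unit_normal = NormalDist()
    estimates = []
    for m_s, sigma_s, p in stimuli:
        if lo <= p <= hi:
            x = unit_normal.inv_cdf(p)           # distance from M_s in units of sigma_s
            estimates.append(m_s + x * sigma_s)  # y = M_s + x * sigma_s
    return estimates

# Synthetic check: proportions generated from a known limit at 1.20.
true_limit = 1.20
scaled = [(0.75, 0.65), (0.91, 0.81), (1.05, 0.62)]   # (M_s, sigma_s) from Table 1
stimuli = [(m, s, NormalDist(m, s).cdf(true_limit)) for m, s in scaled]

estimates = category_limit_estimates(stimuli)
limit = mean(estimates)
print(len(estimates), round(limit, 4))
```

With real data the separate estimates would differ from stimulus to stimulus, and their spread would give the σ reported alongside each mean in Table 2.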

A great saving in labor with little loss in accuracy may be effected
if graphic methods are used to scale the remaining stimuli. The
abscissa is laid off in units of the scale above and below the chosen
origin. Then the position of each of the categories is marked off on
the abscissa in terms of the scale values just found. The ordinate is
laid off as in the conventional probability paper. A computing graph
for the results given in Table 2 is shown in Figure 1, which represents
the procedure of scaling the stimulus word “appropriate.” The writer
has secured excellent results from mimeographed charts. In using the
graph the percentile values of the stimulus distribution are plotted as
ordinates on the graph using categories of judgment as abscissae, and
a straight line drawn by inspection. If a satisfactory straight line
cannot be drawn by inspection, the basic hypothesis is not verified
for this stimulus and the method of successive intervals is not
applicable anyway.

[Figure 1. Computing chart headed “Word Study, Method of Successive
Intervals.” Number: 91. Word: Appropriate. Scale Value: 4.00.
Sigma: 0.55. Rows for frequency and percentile; abscissa marked in
scale values and categories of judgment.]

[Figure 2. Abscissa: Categories of Judgment (I, II, III, ...).]

The x-intercept read from the scale of units at the top is the
scale value of the stimulus. The reciprocal of the slope of the line is
the standard deviation of the stimulus. The computation of scale values
and $\sigma$’s may be further pictured in Figure 2. A section of the
graph is cut out and mounted on a card, with the scale values plainly
marked on the upper edge, and the values for $\sigma$ on the lower edge.
After the line has been drawn, this graph-section is laid parallel to
the x-axis with the upper right corner touching the line. The y-axis
cuts the upper edge of the card at the scale value and the line cuts
the lower edge of the card at the value for $\sigma$. The method just
described will be referred to hereafter as the “short” method.
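
An arithmetic counterpart of the graphic procedure may make the geometry clearer. The sketch below substitutes an ordinary least-squares line for the line “drawn by inspection,” which is a change of technique and not the procedure of the paper. Percentiles are converted to normal deviates (the ordinate of probability paper), the category limits of Table 2 serve as abscissae, the scale value is read where the line crosses the 50th percentile, and σ is the reciprocal of the slope. The function name is hypothetical and the percentiles are synthetic, generated from a scale value of 4.00 and a σ of 0.55.

```python
from statistics import NormalDist

def scale_stimulus(limits, cum_props, lo=0.05, hi=0.95):
    """Return (scale value, sigma) from a least-squares line on probability paper.

    limits:    scale values of the category upper limits (abscissae)
    cum_props: cumulative proportion of judgments at or below each limit
    """
    unit_normal = NormalDist()
    points = [(y, unit_normal.inv_cdf(p))
              for y, p in zip(limits, cum_props) if lo <= p <= hi]
    if len(points) < 2:
        raise ValueError("need at least two usable categories")
    n = len(points)
    mean_y = sum(y for y, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    slope = (sum((y - mean_y) * (z - mean_z) for y, z in points)
             / sum((y - mean_y) ** 2 for y, _ in points))
    intercept = mean_z - slope * mean_y
    return -intercept / slope, 1.0 / slope    # x-intercept, reciprocal of slope

# Category limits from Table 2; percentiles are synthetic, not observed data.
limits = [0.64, 1.20, 1.60, 1.95, 2.64, 3.36, 3.85, 4.21, 4.66, 5.26]
props = [NormalDist(4.00, 0.55).cdf(y) for y in limits]

scale_value, sigma = scale_stimulus(limits, props)
print(round(scale_value, 2), round(sigma, 2))   # 4.0 0.55
```

A poor fit of the points to the line would correspond to the case in which no satisfactory straight line can be drawn by inspection.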

An idea of the saving in time of the short method over the long 
method described by Saffir can be gained from the following compari- 
son: The same ten stimuli were scaled by each method by each of two 
computers. On the basis of these results, the estimated time to scale 
300 stimuli after the percentiles have been computed is 25 hours by 
the long method; by the short method (including the time for scaling 
a composite criterion group of 30 stimuli and determining the scale 
values of the categories of judgment) the time is only 8 hours. 

As a test of the method, 37 words (other than those forming the 
composite criterion which had been scaled by the graphic method de- 
scribed above) were put with the standard stimulus and scaled by the 
long method of successive intervals, all computations being done 
arithmetically rather than graphically, precisely as described by 
Saffir (2). The 37 stimuli were selected so as to be evenly spaced 
throughout the entire range of stimuli. Since none of them was used 
in determining the spacing of the categories of judgment, the values 
determined by the present method and those obtained by Saffir’s meth- 
od are computationally independent. When these two sets of values 
were obtained, the coefficient of correlation between them was .995, 
a value which indicates substantial identity between the two methods 
and completely justifies the use of the shorter method. 

Certain considerations, furthermore, indicate that the shorter 
method yields the more stable values. It would seem reasonable that 
the mean scale value of a category limit determined from a number
of stimuli would be more stable than the value determined by a single
stimulus (as in the long method). In the short method, more observa- 
tions can be utilized as the basis for scaling since the long method 
must discard observations for categories that do not overlap in two 
successive distributions. (For the 37 stimuli the long method used an 
average of 3.3 observations, the short method an average of 3.8.) 

The method here proposed is thus seen to overcome the disadvan- 
tages of the original method; it yields results which are identical with 
those obtained by the longer method to the extent of a correlation co- 
efficient of .995; it effects tremendous savings in time of computation; 
finally, the results obtained are, on a priori grounds, more stable than
those dependent on the long method.
AN APPROXIMATION METHOD FOR OBTAINING
A MAXIMIZED MULTIPLE CRITERION 


ROBERT J. WHERRY 
UNIVERSITY OF NORTH CAROLINA 


A new approximation method for obtaining a maximized multiple 
criterion, based on the formula of Edgerton and Kolbe (2), is pre- 
sented. By applications to examples from the literature (1, 2), the 
new method is evaluated in comparison with the Horst approxima- 
tion (1), a suggested revision of the Horst procedure, and the more 
exact but more laborious iterative method for the principal axis solu- 
tion of Hotelling. 


Horst (1) and Edgerton and Kolbe (2) have both presented ar- 
ticles which enable one to obtain a maximized multiple criterion. 
While the methods were achieved by different theoretical approaches, 
the final results were identical except for notation. 

Horst, in concluding his article, pointed out that the true solution 
was quite laborious if the number of variables was at all large. He 
then presented a very simple approximation method. His formula for 
the approximation weight of a given variable was 


\[ W_i = \sum_j r_{ij} , \tag{1} \]


which, being relative already, might be made to approach the true 
weight, obtained by a solution of the characteristic equation of the 
matrix and a consequent Doolittle solution, by dividing any one of 
the weights (that of the one selected as a criterion variable in the 
Doolittle solution) into all of them. A three-variable example was 
presented. 

Edgerton and Kolbe have shown, however, that the Horst ap- 
proximation technique was not very effective since it yielded weights 
much opposed to the true weights for a five-variable problem used to 
illustrate their method. 

It is the purpose of this paper to present a new approximation 
technique which will not be subject to this criticism. The writer has 
chosen the Edgerton-Kolbe formula as the basis for the approxima- 
tion technique. 

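The first step is to find the approximate value of B in the following
matrix so that R = 0:

\[
R = \begin{vmatrix}
 m+B     & -r_{12} & \cdots & -r_{1m} & -r_{1n} \\
 -r_{12} & m+B     & \cdots & -r_{2m} & -r_{2n} \\
 \vdots  & \vdots  &        & \vdots  & \vdots  \\
 -r_{1n} & -r_{2n} & \cdots & -r_{mn} & m+B
\end{vmatrix}
\tag{2}
\]

If we assume that all values of $r_{ij}$ are equal to $\bar{r}$, the mean value of
all the $r_{ij}$’s in the matrix, our R becomes:

\[
R = \begin{vmatrix}
 m+B      & -\bar{r} & \cdots & -\bar{r} & -\bar{r} \\
 -\bar{r} & m+B      & \cdots & -\bar{r} & -\bar{r} \\
 \vdots   & \vdots   &        & \vdots   & \vdots   \\
 -\bar{r} & -\bar{r} & \cdots & -\bar{r} & m+B
\end{vmatrix}
\tag{3}
\]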

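If we then (1) subtract the last column from each of the remaining
columns and then (2) add the first m rows to the nth row, we obtain

\[
R = \begin{vmatrix}
 m+B+\bar{r} & 0           & \cdots & 0           & -\bar{r}     \\
 0           & m+B+\bar{r} & \cdots & 0           & -\bar{r}     \\
 \vdots      & \vdots      &        & \vdots      & \vdots       \\
 0           & 0           & \cdots & m+B+\bar{r} & -\bar{r}     \\
 0           & 0           & \cdots & 0           & m+B-m\bar{r}
\end{vmatrix}
\tag{4}
\]

Expanding this determinant yields

\[ R = (m + B + \bar{r})^{m} \, (m + B - m\bar{r}) . \tag{5} \]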

When these factors are set equal to zero, it is seen that there are
m equal roots with a value of $-m - \bar{r}$ and one distinctive root with a
value of $-m(1 - \bar{r})$. Now the Edgerton-Kolbe method demands that
root which is nearest zero, and it can be seen that when the average
value of r ($\bar{r}$) is positive this unique root is the one which is wanted.
If $\bar{r}$ were negative, however, the method would require the use of the
multiple root, and would lead to possible error because in actuality
these roots, while close to each other, would not be equal since the r’s
would not be equal. It then would appear to be necessary, in order to
secure a unique solution, to reverse the scoring for those variables
whose intercorrelations were predominantly negative in matrices containing
an excess of negative r’s. This will have the further advantage
of making possible the use of a single approximation formula to handle
all types of matrices.
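
The factorization in equation (5) and the location of its roots can be confirmed numerically. The sketch below builds the equal-correlation determinant of equation (3) for arbitrary illustrative values (m = 4, B = 0.7, mean correlation 0.3; none of these come from the paper) and compares it with the factored form; it then checks that the determinant vanishes at the distinct root B = -m(1 - r̄).

```python
import numpy as np

def equal_r_determinant(m, B, rbar):
    """Determinant of the (m+1)-order matrix with m+B on the diagonal and -rbar elsewhere."""
    n = m + 1
    M = np.full((n, n), -rbar)
    np.fill_diagonal(M, m + B)
    return np.linalg.det(M)

def factored_form(m, B, rbar):
    """Right-hand side of equation (5)."""
    return (m + B + rbar) ** m * (m + B - m * rbar)

m, B, rbar = 4, 0.7, 0.3                      # illustrative values only
direct = equal_r_determinant(m, B, rbar)
factored = factored_form(m, B, rbar)

distinct_root = -m * (1.0 - rbar)             # the root nearest zero when rbar > 0
multiple_root = -m - rbar
at_root = equal_r_determinant(m, distinct_root, rbar)

print(round(direct, 6), round(factored, 6), distinct_root, multiple_root)
```

For a positive mean correlation the distinct root is the one closer to zero, which is the root the approximation uses.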

To solve for the proper weights we must further obtain a solution
of the first m variables in terms of the last variable. We must first
distinguish between the values of $r_{ij}$ in the last column which we have
arbitrarily chosen as our criterion column, the mean of which we shall
call $\bar{r}_n$, and the remaining values of $r_{ij}$, whose average value we shall
call $\bar{r}_2$.

If then we substitute our value of B, namely $-m(1 - \bar{r})$, and
make assumptions of equality for the two types of correlations as
suggested in the above paragraph, our original matrix takes the form:

\[
R = \begin{vmatrix}
 m\bar{r}   & -\bar{r}_2 & \cdots & -\bar{r}_2 & -\bar{r}_n \\
 -\bar{r}_2 & m\bar{r}   & \cdots & -\bar{r}_2 & -\bar{r}_n \\
 \vdots     & \vdots     &        & \vdots     & \vdots     \\
 -\bar{r}_2 & -\bar{r}_2 & \cdots & m\bar{r}   & -\bar{r}_n \\
 -\bar{r}_n & -\bar{r}_n & \cdots & -\bar{r}_n & m\bar{r}
\end{vmatrix}
\tag{6}
\]


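Next we solve for the weights for the first m variables in terms of the
nth, and obtain (after Reitz), omitting the second column,

\[
W_2 = - \,
\frac{\begin{vmatrix}
 m\bar{r}   & -\bar{r}_2 & \cdots & -\bar{r}_2 & \bar{r}_n \\
 -\bar{r}_2 & -\bar{r}_2 & \cdots & -\bar{r}_2 & \bar{r}_n \\
 -\bar{r}_2 & m\bar{r}   & \cdots & -\bar{r}_2 & \bar{r}_n \\
 \vdots     & \vdots     &        & \vdots     & \vdots    \\
 -\bar{r}_2 & -\bar{r}_2 & \cdots & m\bar{r}   & \bar{r}_n
\end{vmatrix}}
{\begin{vmatrix}
 m\bar{r}   & -\bar{r}_2 & \cdots & -\bar{r}_2 & -\bar{r}_2 \\
 -\bar{r}_2 & m\bar{r}   & \cdots & -\bar{r}_2 & -\bar{r}_2 \\
 \vdots     & \vdots     &        & \vdots     & \vdots     \\
 -\bar{r}_2 & -\bar{r}_2 & \cdots & m\bar{r}   & -\bar{r}_2 \\
 -\bar{r}_2 & -\bar{r}_2 & \cdots & -\bar{r}_2 & m\bar{r}
\end{vmatrix}}
\tag{7}
\]
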

If in the numerator of equation (7) we: (1) subtract the next to
last column from each of the preceding columns, and then (2) add all
of the other rows to the last row; and if in the denominator we: (1)
subtract the last column from all of the rest, and then (2) add all
other rows to the last row, we obtain:

\[
W_2 = - \,
\frac{\begin{vmatrix}
 m\bar{r}+\bar{r}_2 & 0                  & \cdots & 0                  & -\bar{r}_2              & \bar{r}_n  \\
 0                  & 0                  & \cdots & 0                  & -\bar{r}_2              & \bar{r}_n  \\
 0                  & m\bar{r}+\bar{r}_2 & \cdots & 0                  & -\bar{r}_2              & \bar{r}_n  \\
 \vdots             & \vdots             &        & \vdots             & \vdots                  & \vdots     \\
 0                  & 0                  & \cdots & m\bar{r}+\bar{r}_2 & -\bar{r}_2              & \bar{r}_n  \\
 0                  & 0                  & \cdots & 0                  & m\bar{r}-(m-1)\bar{r}_2 & m\bar{r}_n
\end{vmatrix}}
{\begin{vmatrix}
 m\bar{r}+\bar{r}_2 & 0                  & \cdots & 0                  & -\bar{r}_2              \\
 0                  & m\bar{r}+\bar{r}_2 & \cdots & 0                  & -\bar{r}_2              \\
 \vdots             & \vdots             &        & \vdots             & \vdots                  \\
 0                  & 0                  & \cdots & m\bar{r}+\bar{r}_2 & -\bar{r}_2              \\
 0                  & 0                  & \cdots & 0                  & m\bar{r}-(m-1)\bar{r}_2
\end{vmatrix}}
\tag{8}
\]

expansion of which gives

\[
W_2 = \frac{-(m\bar{r} + \bar{r}_2)^{m-2} \, \bar{r}_n \, (-m\bar{r} - \bar{r}_2)}
           {(m\bar{r} + \bar{r}_2)^{m-1} \, (m\bar{r} - (m-1)\bar{r}_2)} ,
\tag{9}
\]

which reduces to

\[
W_2 = \frac{\bar{r}_n}{m\bar{r} - (m-1)\bar{r}_2} ,
\tag{10}
\]

where in order that the W’s vary we substitute the value of $r_{ni}$ for
each variable in place of the average value $\bar{r}_n$. The author also thought
of substituting for $\bar{r}_2$ some value more specific for the variable in
question, but found that this resulted in poorer prediction.

Fortunately the two problems which have appeared in the litera- 
ture worked by these methods, those of Horst and Edgerton-Kolbe, 
are respectively examples of the cases where the average r is positive 
to begin with and where the signs must be reflected for certain vari- 
ables before that condition is obtained. 

In the example given by Horst we have a three-variable problem 
in which all of the intercorrelations are initially positive. This per- 
mits the immediate application of formula (10). The matrix is 


            1       2       3
    1             .333    .194
    2     .333            .298
    3     .194    .298

and from this table we obtain the values

\[
\begin{array}{l}
m\bar{r} = .544, \\
(m-1)\bar{r}_2 = .333, \\
r_{n1} = .194, \\
r_{n2} = .288,
\end{array}
\]

which are the values to be substituted in equation (10). 
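
As a worked instance of formula (10), the sketch below substitutes the four values listed above. The function and argument names are hypothetical; the inputs are taken from the list as printed and are not recomputed from the correlation table. The criterion variable itself keeps a weight of 1.000.

```python
def wherry_weights(criterion_rs, m_rbar, m1_rbar2):
    """Formula (10): weight_i = r_ni / (m * rbar - (m - 1) * rbar_2).

    criterion_rs: correlations of each variable with the criterion variable
    m_rbar:       m times the mean of all intercorrelations
    m1_rbar2:     (m - 1) times the mean of the non-criterion intercorrelations
    """
    denominator = m_rbar - m1_rbar2
    if denominator == 0:
        raise ZeroDivisionError("m*rbar equals (m-1)*rbar_2; formula (10) is undefined")
    return [r / denominator for r in criterion_rs]

# Values listed for the Horst three-variable example (variable 3 as criterion).
weights = wherry_weights([0.194, 0.288], m_rbar=0.544, m1_rbar2=0.333)
rounded = [round(w, 3) for w in weights]
print(rounded)   # [0.919, 1.365]
```

These are the figures entered on the “Wherry Approx.” line of the comparison table that follows.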

In the following table the results from the use of this equation 
are compared with those obtained by actual solution of the problem. 
They are also compared with those obtained by the Horst approxima- 
tion method and by an application of Hotelling’s iterative method for 
the principal axis solution. The various methods are evaluated in 
terms of Chi-Squared and P on the basis of their agreement with the 
correct weights. 





                       1        2        3     Chi-Squared     P
  Correct values     1.090    1.094    1.000
  Horst Approx.      1.030    1.197    1.000      .012        .93
  Wherry Approx.      .919    1.365    1.000      .050        .83
  Hotelling          1.069    1.175    1.000      .001        .99


From the above table it is seen that, while all three of the methods 
have P values which indicate that they bear close resemblance to the 
theoretical (actual) weights, the Hotelling method gave the closest 
approximation. Of course, as Wilks (3) has proved, the Hotelling 
method would give exact weights if enough iterations were used to 
enough decimal points. Here the method was carried through five
such iterations.
The correlation matrix used by Edgerton and Kolbe was: 


            1        2        3        4        5
    1             -.049    -.160     .026     .098
    2    -.049              .436    -.637    -.480
    3    -.160     .436              .289    -.151
    4     .026    -.637     .289              .408
    5     .098    -.480    -.151     .408


Since this matrix is predominantly negative, it must be rectified by
changing all of the signs in both columns and rows for variables 2 and
3 before formula (10) is applicable. Making these changes the matrix
becomes:

            1       -2       -3        4        5
    1              .049     .160     .026     .098
   -2     .049              .436     .637     .480
   -3     .160     .436             -.289     .151
    4     .026     .637    -.289              .408
    5     .098     .480     .151     .408





and from this matrix, using variable 5 as criterion, we obtain the 
values 


\[
\begin{array}{ll}
r_{n1} = .098 & \bar{r} = .215 \\
r_{n2} = .480 & m\bar{r} = .860 \\
r_{n3} = .151 & \bar{r}_2 = .170 \\
r_{n4} = .408 & (m-1)\bar{r}_2 = .510
\end{array}
\]


which are the values to be substituted in equation (10). After the 
weights were obtained the signs were made negative for variables 2 
and 3, which had been reversed prior to the application of the method. 

In the following table these weights are compared with the true 
weights. Comparisons are also made for the original Horst method, 
the Hotelling (again carried through five iterations), and a revised 
Horst method suggested by Ledyard Tucker. Tucker’s revision con- 
sists of changing the signs as in the present method before applying 
the original Horst formula. 


                      1        2        3        4        5     Chi-Squared     P
  Actual weights     .230   -1.213    -.399     .999    1.000
  Original Horst    1.046     .309    1.616    1.241    1.000     15.039       .010
  Revised Horst      .624   -1.218    -.682     .834    1.000       .902       .824
  Wherry             .271   -1.412    -.444    1.200    1.000       .097       .991
  Hotelling          .251   -1.218    -.434     .978    1.000       .003       .999


In this example it is clear that the new method is superior to the 
original Horst method and somewhat superior to the Revised Horst 
method. Indeed it is nearly as good as the Hotelling method when car- 
ried through five iterations. 
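
The two Horst lines of the table can be reproduced from the correlation matrix given above. The sketch below rests on one assumption that should be stated plainly: formula (1) is read here as the sum of each row of the correlation matrix with the unit self-correlation included, scaled so that the criterion variable (variable 5) has weight 1.000. That reading is an inference; it is adopted because it returns the tabulated values. The revised procedure reflects variables 2 and 3 before summing and restores their signs afterward. Function names are hypothetical.

```python
def horst_weights(corr, criterion):
    """Row sums of the correlation matrix, scaled so the criterion weight is 1."""
    sums = [sum(row) for row in corr]
    return [s / sums[criterion] for s in sums]

def reflect(corr, variables):
    """Reverse the scoring of the given variables (0-based indices)."""
    signs = [-1.0 if i in variables else 1.0 for i in range(len(corr))]
    reflected = [[signs[i] * signs[j] * corr[i][j] for j in range(len(corr))]
                 for i in range(len(corr))]
    return reflected, signs

# Edgerton-Kolbe correlation matrix, with unit diagonal supplied.
R = [
    [1.000, -0.049, -0.160,  0.026,  0.098],
    [-0.049, 1.000,  0.436, -0.637, -0.480],
    [-0.160, 0.436,  1.000,  0.289, -0.151],
    [0.026, -0.637,  0.289,  1.000,  0.408],
    [0.098, -0.480, -0.151,  0.408,  1.000],
]

original = [round(w, 3) for w in horst_weights(R, criterion=4)]

reflected, signs = reflect(R, {1, 2})          # variables 2 and 3
revised = [round(s * w, 3)
           for s, w in zip(signs, horst_weights(reflected, criterion=4))]

print(original)   # [1.046, 0.309, 1.616, 1.241, 1.0]
print(revised)    # [0.624, -1.218, -0.682, 0.834, 1.0]
```

The reflection step alone moves the Horst weights from the wrong signs for variables 2 and 3 to the neighborhood of the actual weights, which is the effect attributed to Tucker's revision.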

If we compare the new method with the Horst method (using the
revised method for the negative $\bar{r}$ case) for both problems, we find
the average P for the new method to be .91 while the average P for
the Horst (revised) is .87. In the case of the new formula, furthermore,
we find that it worked best on the larger problem containing
five variables. In view of the fact that the labor is only slightly
increased by the new method it seems advisable to use it, especially on
problems involving a large number of variables.

In comparison with the Hotelling method, the new formula was 
less efficient for the three-variable problem, but almost equally good 
for the five-variable problem. Of course the Hotelling method if car- 
ried far enough would yield the actual values, but the labor as com- 
pared with the new method would be much greater. Should the new 
method prove equally efficient on other long problems its use will 
greatly facilitate the calculation of weights. The more arduous
Hotelling method should probably be used when the number of variables
is small.
ANNOUNCEMENT


The psychological tests used in a factorial study of the 
perceptual factor (Psychometrika, March, 1938) at the Lane 
Technical High School, in Chicago, may be obtained from the 
American Documentation Institute, care of Offices of Science 
Service, 2101 Constitution Avenue, Washington D.C. Order 
Document 1337, remitting $1.98 for microfilm form, or $18.00 
for photoprints readable without optical aid.
COMMENT ON WILSON AND WORCESTER’S
“NOTE ON FACTOR ANALYSIS”* 


TRUMAN L. KELLEY 
HARVARD UNIVERSITY 


The authors investigate logically and by comparison with multi- 
variate relationships found in geometry and the physical sciences the 
meaning of such factors as might result from an analysis by the prin- 
cipal components method as advocated by Hotelling and Kelley. They 
conclude that there is danger that a mathematically derived factor 
will lack psychological substance, but they offer no criterion of psycho- 
logical substance. There are a number of interesting issues raised by 
their discussion. 

Their mathematical treatment in Section 2† showing that “Hotelling
analysis would not give back known traits” is more simply understood
geometrically than algebraically. The authors consider a situation
wherein three tests x, y, and z are totally dependent upon two
factors, f and g. Starting with loadings a = .6, b = .8; a’ = .28,
b’ = .96; a” = .936, b” = .352 the intercorrelations are $r_{xy} = .9360$,
$r_{xz} = .8432$, $r_{yz} = .6000$. However, we must not be led astray, for we
have two, not three, independent variables. Thus, if we take z as one
of the variables and partial it out of the other two, we find that these
two are now identical, i.e., $r_{xy \cdot z} = 1.000$. A three-dimensional
situation is postulated, but the swarm of observed points in xyz-space are
coplanar. Since there is only a two-dimensional continuum in the phe- 
nomena, however many dimensions are chosen in the axes describing 
it, let us treat it as a simple two-dimensional problem. 
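
The figures in this paragraph are easily checked. The sketch below computes the three intercorrelations from the stated loadings on the uncorrelated factors f and g, and then the first-order partial correlation of x and y with z held constant, using the usual partial correlation formula. The variable and function names are illustrative only.

```python
from math import sqrt

# Loadings of the three tests on the uncorrelated factors f and g.
loadings = {"x": (0.6, 0.8), "y": (0.28, 0.96), "z": (0.936, 0.352)}

def r(a, b):
    """Correlation of two tests that depend wholly on f and g."""
    return sum(p * q for p, q in zip(loadings[a], loadings[b]))

def partial_r(a, b, c):
    """First-order partial correlation of a and b with c held constant."""
    return (r(a, b) - r(a, c) * r(b, c)) / sqrt((1 - r(a, c) ** 2) * (1 - r(b, c) ** 2))

r_xy, r_xz, r_yz = r("x", "y"), r("x", "z"), r("y", "z")
r_xy_z = partial_r("x", "y", "z")

print(round(r_xy, 4), round(r_xz, 4), round(r_yz, 4), round(r_xy_z, 3))
# 0.936 0.8432 0.6 1.0
```

A partial correlation of unity is what one expects when three unit-variance variables span only two dimensions.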

In this two-dimensional continuum we have a scatter diagram, 
and if we rotate to the major and minor axes we have the variables 
f’ and g’, with variances 2.594 and .406, as computed by the authors
following Hotelling’s procedure. The principal component is f’ and 
it has maximum variance. For simplicity of illustration, let us make 


* Edwin B. Wilson and Jane Worcester, Psychometrika, 1939, 4, 133-148.

† The observation, “There seems to be a confusion in this matter on p. 60 of
Kelley’s Essential Traits of Mental Life,” is gratefully noted. The confusion is 
due to a regrettable slip which is rectified by dividing each y in the first four 
equations by its standard deviation. 
an assumption, not necessary for the argument, that the points in the
f’g’ plane give a normal bivariate surface. If contour ellipses are
drawn they will be similar and similarly placed and have ratios between
major and minor axes of $\sqrt{2.594}$ to $\sqrt{.406}$. Any one of these
projected upon another plane will ordinarily project into another
ellipse, but it is possible to find a plane such that the projection will
be a circle; such, in fact, is the case if the projection is upon the fg
plane, with which the authors started originally. The projection being a
circle, the correlation, $r_{fg}$, is of course zero. This clarifies the
relationship between variables f, g and variables f’, g’. Furthermore, if
the original metric given by $\sigma_x = \sigma_y = \sigma_z = 1.000$ is
the starting point, then, though both sets are uncorrelated, the f’g’ set
have the added property of maximal and minimal variance. The analysis
yielding f’ and g’ did not “give back” f and g, because it was designed
to do something else: yield components with maximal and minimal variances.
That we start with f and g is nothing in their favor unless they can 
be shown to have some demonstrable meaning and the authors make 
no such claim for them. Though the full meaning of f’ and g’ remains 
to be demonstrated, at least one property, that of maximal and mini- 
mal variance, is already demonstrated by the analysis, so they do have 
to this extent (but to no other) a unique claim not shared by other un- 
correlated sets. 
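
The variances 2.594 and .406 can be recovered without a full three-variable solution. The sketch below assumes the loadings .6, .8; .28, .96; .936, .352 and uses the fact that, for tests lying wholly in the plane of f and g, the nonzero component variances are the latent roots of the 2 x 2 matrix of sums of squares and cross-products of the loadings. The variable names are illustrative.

```python
import math

loadings = [(0.6, 0.8), (0.28, 0.96), (0.936, 0.352)]  # (a, b) for x, y, z

# Sums of squares and cross-products of the loadings on f and g.
s_aa = sum(a * a for a, _ in loadings)
s_bb = sum(b * b for _, b in loadings)
s_ab = sum(a * b for a, b in loadings)

# Closed-form latent roots of the symmetric matrix [[s_aa, s_ab], [s_ab, s_bb]].
mean = (s_aa + s_bb) / 2            # half the total variance of the three tests
half_gap = math.hypot((s_aa - s_bb) / 2, s_ab)
var_major, var_minor = mean + half_gap, mean - half_gap

# Ratio of major to minor axis of the contour ellipses in the f'g' plane.
axis_ratio = math.sqrt(var_major / var_minor)
print(round(var_major, 3), round(var_minor, 3), round(axis_ratio, 2))
```

The two roots sum to 3, the total variance of three standardized tests, which is another way of seeing that no third dimension is present.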

On p. 136 the authors ask, “Why should there be any particular 
significance psychologically to that vector of mind which has the prop- 
erty that the sum of the squares of the projections of a set of unit 
vectors (tests) along it be maximum?” The answer is simple: that 
vector of mind is the one of all possible ones with reference to which 
people show the greatest variance (see elaboration in next para- 
graph). In other words, it is a trait in which there are glaring indi- 
vidual differences, not trivial ones. Let us not read anything more 
into it than just this. The present writer maintains that, lacking rea- 
sons to the contrary, this is ample reason for considering this particu- 
lar vector important. He has mentioned reasons to the contrary, e.g., 
age is a demonstrably meaningful concept, and he would object to a 
factor which was partly age and partly something else, even though 
the variance of such a vector is greater than that of age alone. De- 
monstrable psychological meaning is of first importance, but let us not 
cry down a vector of mind with the practical utility that attaches to 
large variance because it does not have a priori psychological mean- 
ing in favor of some other vector equally without it. It is not suf- 
ficient for a psychologist to assert that his vector has psychological 
meaning; to be given credence it must be demonstrable in some such 
objective way as age is demonstrable. 
It is granted that for the vector of greatest variance to be important
because of its variance, the original metric given by the initial
tests as weighted must be representative of the psychological field that
one is concerned with. If given an initial battery of tests, all weighted
equally, in which there is a large number of pure or semi-pure spatial
relationship tests, then the principal axes will be “spatial relationships”
entirely independent of whether the educational or guidance
problems to be solved by giving the tests to individuals involve spatial 
relationship situations or not. Such emphasis is unreasonable and must 
be avoided. 

The authors write, p. 134, “It is usual to express the scores on the 
tests in ‘standard measure’ so that
$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \cdots = 1$.” All outcomes
are affected by this usual procedure, but no justification or discussion
of it is given. Thus the crux of the factorial problem is casually
dismissed. We should not build upon a foundation whose only claim 
is that it is usual. The selection and the weighting of the initial tests 
is entitled to even more consideration than that which has been lav- 
ished upon subsequent techniques. 
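
How much the outcome depends on the initial weighting can be seen in the smallest possible case. The sketch below is purely illustrative: the two tests, their standard deviations (2 and 1), and their correlation (.5) are hypothetical values chosen for the example, not data from either paper.

```python
import math

def principal_axis(var1, var2, cov):
    """Share of total variance on the major axis, and the axis angle in degrees,
    for a 2 x 2 covariance matrix [[var1, cov], [cov, var2]]."""
    mean = (var1 + var2) / 2
    half_gap = math.hypot((var1 - var2) / 2, cov)
    share = (mean + half_gap) / (var1 + var2)
    angle = 0.5 * math.degrees(math.atan2(2 * cov, var1 - var2))
    return share, angle

# Hypothetical pair of tests: standard deviations 2 and 1, correlation .5.
sd1, sd2, r = 2.0, 1.0, 0.5
raw_share, raw_angle = principal_axis(sd1 ** 2, sd2 ** 2, r * sd1 * sd2)
std_share, std_angle = principal_axis(1.0, 1.0, r)

print(round(raw_share, 3), round(raw_angle, 1))   # raw-score metric
print(round(std_share, 3), round(std_angle, 1))   # standard measure
```

In the raw metric the major axis lies close to the more variable test; in standard measure it bisects the two tests. The same correlation thus yields different "principal" traits according to the weighting adopted at the outset.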

The authors, drawing analogies from non-linear relationships ex- 
isting in the physical sciences, seem to advocate some sort of a trait 
measure starting from zero as do centimeters, grams, and seconds in 
the laws of physical science, but how this concept can be applied to 
the psychological field is not made clear. They write, footnote p. 134, 
“... to introduce the device of referring [measures] to the mean of
a group itself leads us away from the concept of a trait as a charac- 
teristic of an individual; of course if the traits in which we are in- 
terested are those of the group, this matter may be quite different.” 
Again they write, footnote p. 134, “In this statement [involving stand- 
ard measure] we see again the encroachment of the ideological ele- 
ment ‘group’ upon the ideological element ‘individual.’ It is doubtful 
if there is utility, it may be that there would be disutility in introduc- 
ing standard measures of such a thing as pressure or temperature or, 
in the psychological field, of I.Q. or any other individual trait. Ordi- 
narily in scientific measurements ‘zeros’ of scales and ‘units’ of scales 
are not defined relative to groups.” 

It might be intellectually stimulating in a way for Robinson Cru- 
soe to study his man Friday unrelated to himself or other mortals, but 
the present writer fails to see how such a study would have any rela- 
tionship to that broad field of individual differences of which factor 
analysis as thus far developed is a part. Instead of being led away 
by reference of measures to a mean we are by this device thrown into 
the problem of individual differences. What is the ideological “indi- 
vidual” divorced from “group”? If not referred to group it must be 
referred to something else, perhaps the c.g.s. system in which case it
then becomes a matter of physics, or perhaps to something else, mak- 
ing it biology or physiology, but the psychological problem of indi- 
vidual differences is gone. Not only must we retain the concept group, 
but we must carefully define it. The present writer has argued for a 
group consisting of individuals competitive in the fields — educational, 
vocational, or what not — that we are attempting to understand and 
make adjustments in. Surely it is an argument in favor of factor 
analysis that it does make use of measures of group central tendency 
and variability. 

Pertinent in this connection is the authors’ illustration of a fac- 
tor analysis of bricks with mean dimensions 12, 6, 1, and their obser- 
vation of its disutility in connection with bricks with mean dimen- 
sions 4, 2, 1. Passing over the remoteness from psychology of the 
brick problem because of the non-linear relationships between length, 
breadth, and thickness and volume, we may note that 4, 2, 1 bricks 
are not competitive with 12, 6, 1 bricks and we are quite indifferent 
to the disutility of the first analysis in the second situation. There is 
no search for timeless, spaceless, populationless truth in factor analy- 
sis; rather, it represents a simple, straightforward problem of descrip- 
tion in several dimensions of a definite group functioning in definite 
manners, and he who assumes to read more remote verities into the 
factorial outcome is certainly doomed to disappointment. 
A RE-ANALYSIS OF
A TEST OF THE THEORY OF TWO FACTORS* 


ROBERT BLAKEY 
GLENWOOD STATE SCHOOL 


The study of William Brown and William Stephenson, “A Test 
of the Theory of Two Factors,” is re-analyzed by means of the Thur- 
stone multiple factor methods. No tests or correlations are left out 
of the original table of correlations as is done in the original analy- 
sis in an attempt to validate the two-factor theory. Space, verbal, 
and perceptual speed factors similar to those found by Thurstone, 
Wright, and Garrett are identified. A common factor of “Matura- 
tion” is postulated to account for the remaining communality of the 
tests. A fifth factor is considered to have no significance due to the 
small amount of variance which it contributes to the total.†


INTRODUCTION 

Between the group of psychologists who have accepted Professor 
Spearman’s theories of factor analysis and the Thurstone school of 
factor analysts, there has been in the past a considerable controversy 
as to the most logical and precise factor interpretation to be used on 
a set of correlations. 

In 1932, William Brown and William Stephenson set out to “.... 
furnish a precise and definite solution of the problem of a Central In- 
tellective Factor” (1, p. 173). They gave a group of twenty tests to 
a large group of subjects. They computed the tetrad differences among 
the test correlations and after discarding all tetrads involving one 
test and one correlation coefficient, they showed that the revised dis- 
tribution would come within a close approximation of the expected 
distribution assuming a central intellective factor “g” and a specific 
factor “s” for each variable.

Godfrey Thomson (7) in a note on Brown’s and Stephenson’s study said,


On no theory at all, by the laws of probability, there will be a ten- 
dency towards zero tetrad-differences among correlation coefficients. 
This tendency is the stronger, the more complex the background of 
causes concerned in the correlations; and the mind is surely complex. 


* Master’s Thesis, University of Chicago, Department of Psychology, 1939.
† Grateful acknowledgment is made to Dr. L. L. Thurstone for his suggestion
of this study and for valuable criticism and suggestions during its
preparation.
When in addition the tests are selected “after much preliminary trial”
and later it is “found necessary to reject one of the tests” (i.e. to reject
2907 tetrad differences) and then further to reject “one of the correlation
coefficients” which remain (i.e. to reject 272 more of the tetrad-differences)
it would indeed be surprising if the surviving data did not
show a very strong tendency to zero tetrad-differences. Nor can I for
the life of me see what is then proved.


To this note of Thomson’s, Brown and Stephenson (2) replied 
as follows, 


We might add that any number of sets of correlations could be sup- 
plied which do not satisfy the tetrad criterion, and cannot be made to do 
so on any subsidiary theory whatever. ... . Purely as practical psy- 
chologists we require to separate firstly the broadest psychological fea- 
tures, if there are any, and to control them one by one. ... . First 
a search is made for errors other than sampling — surely an eminently 
sensible aim — since we cannot see why sampling should blind us to 
the possibility of other errors. Second, a search is made for possibility 
of overlap — a matter made explicit in the very theory of two factors
itself, and not, as Thomson seems to think, a further theory of additional
excuse.


This discussion and the critical nature of the study of Brown 
and Stephenson (3) make the analysis by the Thurstone techniques a 
particularly logical choice. 

The purpose of this paper is (a) to analyze the study of Brown 
and Stephenson by the Thurstone techniques, (b) to attempt to give 
a psychologically meaningful interpretation to any factors found, (c) 
to attempt to determine the existence and nature of the general fac- 
tor, (d) to compare the factors found with those of other studies, (e) 
to determine if the factors seem to be correlated in this set of data, 
and (f) to compare the results of the Thurstone analysis with the
two-factor analysis of this study.


RESUME OF THE BROWN AND STEPHENSON STUDY (3) 


Twenty tests (or abilities) were selected, not on a narrow a pri- 
ori basis, but so as to fit the theoretical criterion of zero tetrads. 

The tests were administered to three hundred boys aged ten to 
ten and a half years. The boys were normal both physically and men- 
tally. All those who were very far out of their age-grade placement 
were not used. A fore-exercise was given before each test, and, in 
some cases, blackboard demonstration was used to put across the idea 
of the test. The tests were given in four one-hour periods. 
The tests used were as follows (the code numbers of the tests in
the present study are used):*

* The original tests could not be secured, so the description is given from the
literature and may not be completely accurate.

Test No. 1. Inventive Synonyms — the problem is to write a
word meaning the same as the stimulus word.

Test No. 2. Disarranged Sentences — the problem is to rearrange
the words to make a good sentence.

Test No. 3. Understanding Paragraphs — probably of completion
type for digestion of content.

Test No. 4. Inferences — probably multiple-choice type of answer
to given statement.

Test No. 5. Verbal Classification — problem is to underline
response item of same classification as three given
items (example: hat, coat, gloves — wear, scarf,
dog, boy, house).

Test No. 6. Proverbs — probably selecting two analogous
proverbs from a group.

Test No. 7. Alphabetical Forms — the problem is to indicate
what other letters could be made from parts of a
large given capital letter (example: M — N, V).

Test No. 8. Alphabetical Series — the problem is to fill in
blanks in series of XO type (example: XXOOXXOOXXOOXXOO).

Test No. 9. Cancellation — probably selective single letter
cancellation in pied letters.

Test No. 10. Mutilated Pictures — a part of a picture, such as
the handle on a cup, is missing. The problem is
to indicate where a part is missing or a wrong
part present.

Test No. 11. Arithmetical Equations — the problem is to insert
correct signs to make an equation balance
(example: 4 3 2 = 5 is made 4 + 3 - 2 = 5).

Test No. 12. Fitting Shapes — problem is to mark a geometrical
figure to indicate how it could be cut up into
three given smaller parts (one type of form-board).

Test No. 13. Mazes — usual type of Porteus Mazes.

Test No. 14. Pattern Perception — problem is to encircle a
pattern of small crosses so that it matches an example
given at the left.
example:

[FIGURE A: example item, a pattern of small crosses]


Test No. 15. Form Analogies — the problem is to indicate the 
form having the same relation to the third form 
as the second has to the first. 
example:

[FIGURE B: example item for Form Analogies]


Test No. 16. Classification (Rights) — A rule for classifica- 
tion of forms and colors must be formed and used. 

Test No. 17. Overlapping Shapes — the problem is to indicate
within which of a group of overlapping geometrical
forms a number lies.

Test No. 18. Abstraction (pairs) — the problem is to indicate 
which of a group of figures contained properties 
common to each of two stimulus groups (this test 
is similar to Spearman’s Figure Classification 
test but has less difficult items). 

Test No. 19. Code — the problem is to mark the number corre- 
sponding to a code item. 

Test No. 20. Code Parts — the problem is to mark the number 
corresponding to a part of a code item (the parts 
were non-ambiguous). 

Brown and Stephenson first obtained the raw scores and then
normalized them (previously, it had been found that normalized scores 
were frequently necessary for a tetrad-difference criterion fit) (6). 
The correlations were then calculated by the difference method, a 
variation of the Pearson product-moment method. The verbal factor 
was recognized and partialed out, a specificality between Code and 
Code Parts was partialed out, and all the tetrad-differences of the re- 
vised correlational table were computed. The results did not fit the 
expected probable error, so, upon further subjective analysis, it was 
decided to throw out the test “Fitting Shapes” and the correlation 
between “Verbal Classification” and “Paragraph Understanding.” 
The former was discarded because it probably identified a new group 
factor, the latter because it contained a specificality due to speed 
preference. A revised distribution of the remaining tetrad-differences 
agreed very closely with the expected differences to be found if only
a general factor and a specific factor were involved in the perform- 
ance of the tasks. 

They concluded that their main purpose was achieved, namely, 
to establish Spearman’s two-factor theory on an experimental statis- 
tical basis. 


PROCEDURE IN THE PRESENT STUDY 

The correlations used are those secured by Brown and Stephenson
before partialing out the verbal factor and code specificality (3,
p. 368). The mean correlation is .414 with standard deviation .0873. 
The correlations were factored using Thurstone’s centroid method 
(8). The centroid factors are listed in Table 1. The number of fac- 
tors extracted from the data and used was five. This number was de- 
cided upon for several reasons: 














TABLE 1
Centroid Matrix (F)

                            Code              Factors
Variable                     No.     I      II     III     IV      V
Inventive synonyms            1    .717    .442   -.096   -.105   -.068
Disarranged sentences         2    .779    .214   -.084   -.019   -.133
Understanding paragraphs      3    .804    .354    .055   -.142    .093
Inferences                    4    .707    .277    .074    .117    .037
Verbal classification         5    .707    .357   -.065    .070   -.144
Proverbs                      6    .610    .350    .089    .091    .162
Alphabetical forms            7    .608   -.092   -.107    .123   -.227
Alphabetical series           8    .758   -.073   -.090    .198   -.026
Cancellation                  9    .481   -.200    .133   -.178   -.105
Mutilated pictures           10    .487   -.069   -.196   -.122    .145
Arithmetical equations       11    .741   -.035    .075    .087   -.093
Fitting shapes               12    .675   -.266   -.325   -.032    .099
Mazes                        13    .512   -.208   -.121   -.174    .071
Pattern perception           14    .708   -.215   -.172   -.117    .102
Form analogies               15    .699   -.185   -.094    .148    .165
Classification (rights)      16    .688   -.075    .079    .205    .089
Overlapping shapes           17    .622   -.121    .110    .128   -.052
Abstraction (pairs)          18    .628   -.025    .163    .094    .055
Code                         19    .625   -.152    .384   -.198   -.155
Code parts                   20    .722   -.240    .175   -.190   -.067





a) Ledyard Tucker’s criterion of significance of a factor indicates
that the fifth factor is possibly meaningful. This criterion has
been recently revised and is:

$$\frac{\sum |\rho|}{\sum |r|} = \frac{n-1}{n+1},$$

where $\sum |r|$ is the sum of the absolute value of the correlations
(including diagonal used) before any one factor is extracted, and
$\sum |\rho|$ is the sum of the absolute value of the residuals
(including residual diagonals) after extraction of the factor; $n$ is
the number of variables. The criterion value should be below the value
.905 (that is, 19/21 for the twenty variables used here) for the factor
to be significant. This is an empirical criterion which has been found
to agree very well with experimental results. Figure 4 shows the
relation of this criterion to the number of factors in the present study.

[FIGURE 1. Algebraic distribution of fifth factor residuals (M = -.0018, σ = .0238).]

[FIGURE 2. Absolute value distribution of fifth factor residuals.]

[FIGURE 3. Coombs’ Criterion.]

[FIGURE 4. Tucker’s Criterion.]
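
As a rough illustration of how the Tucker criterion described under a) might be computed, the sketch below forms the ratio of summed absolute residuals to summed absolute correlations and compares it with (n - 1)/(n + 1). The three-variable table and its loadings are hypothetical, invented only to exercise the functions; they are not Brown and Stephenson's correlations.

```python
def tucker_ratio(corr, loadings):
    """Sum of |residuals| after removing one factor, divided by the
    sum of |correlations| before removal (diagonals included)."""
    n = len(corr)
    before = sum(abs(corr[i][j]) for i in range(n) for j in range(n))
    after = sum(abs(corr[i][j] - loadings[i] * loadings[j])
                for i in range(n) for j in range(n))
    return after / before

def tucker_limit(n):
    """Value below which the ratio must fall for the factor to count."""
    return (n - 1) / (n + 1)

# Hypothetical table with communalities in the diagonal, and one factor.
corr = [[0.64, 0.56, 0.50],
        [0.56, 0.49, 0.42],
        [0.50, 0.42, 0.36]]
loadings = [0.8, 0.7, 0.6]

ratio = tucker_ratio(corr, loadings)
print(round(ratio, 4), ratio < tucker_limit(len(corr)))
print(round(tucker_limit(20), 3))   # limit for a twenty-variable battery
```

For twenty variables the limit is 19/21, which is the .905 quoted in the text.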

b) The criterion of significance of factors devised by Clyde 
Coombs is that, in a twenty-variable problem, the number of negative 
signs in the correlational table after sign change must be less than 
136 in order for the factor for which the sign change was made to be 
of significance. By this criterion there are only four significant factors
in the present study. The relations of factors to this criterion
are represented in Figure 3. 

c) A further criterion used was the qualities of the distribution 
of fifth-factor residuals (Figs. 1 and 2). The distributions appear to 
be bimodal, but the variance is so small that it was decided not to 
take out more factors. The standard error of a zero correlation for 
a population of 300 is .058, and the standard error of a correlation of 
.414 (mean r in original table of r’s) for a population of 300 is .048.
The standard deviation of .0238 for the fifth-factor residuals is con- 
siderably less than either of the above. 
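
The two standard errors quoted under c) can be reproduced on the assumption that the classical large-sample formulas were used, namely 1/sqrt(N) for a zero correlation and (1 - r^2)/sqrt(N) otherwise. That assumption is not stated in the text, but it returns the printed figures when the mean correlation of .414 given earlier is used.

```python
import math

def se_zero_corr(n):
    """Large-sample standard error of a correlation whose true value is zero."""
    return 1 / math.sqrt(n)

def se_corr(r, n):
    """Large-sample standard error of a correlation of size r."""
    return (1 - r * r) / math.sqrt(n)

N = 300
se0 = se_zero_corr(N)
se_mean_r = se_corr(0.414, N)
residual_sd = 0.0238            # standard deviation of fifth-factor residuals

print(round(se0, 3), round(se_mean_r, 3))
print(residual_sd < se_mean_r < se0)
```

The residual standard deviation is about half of either standard error, which is the basis for stopping at five factors.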

Since there is no harm in including an extra factor for rotational 
freedom, it was decided tentatively to use five factors and, if neces- 
sary, to add another later. However, it was found, upon rotation, 
that five factors were sufficient to define the configuration. 

In the attempt to identify simple structure (8), several methods
of rotation were used. The most useful method was that of extended
vectors; this method is described by Thurstone (9) in a recent
publication. The vectors were lengthened so that the termination of all
the test vectors lay in a hyperplane orthogonal to the first centroid
and at unit distance from the origin. Landahl’s (5) orthogonal rotation
was applied, and oblique rotations were made. Other methods
used were (a) oblique rotations on correct length vectors and (b)
least square fit (on plane B) (10). The rotations were made without
knowledge of the variable names corresponding to code numbers. The
only criteria used were (a) the maximizing of zeros, and (b) a postulated
positive manifold. Rotations were then made to a subjective
best fit of the reference axes. A good check on the validity of these
criteria is the fact that the isolated factors are psychologically
meaningful. The rotated factorial matrix is shown in Table 2.

TABLE 2
Rotated Factorial Matrix (F')

                            Code                     Factor
Variable                     No.     A       B       C       D       E       E'
Inventive synonyms            1    .578    .022    .000    .624   -.116   -.045
Disarranged sentences         2    .346    .031    .026    .735   -.139   -.004
Understanding paragraphs      3    .540    .059    .104    .695    .092    .110
Inferences                    4    .342   -.073   -.052    .676    .106    .242
Verbal classification         5    .419   -.081   -.071    .673   -.128    .040
Proverbs                      6    .413   -.046   -.085    .556    .208    .286
Alphabetical forms            7   -.026    .021   -.006    .644   -.198    .004
Alphabetical series           8    .018    .097   -.092    .782    .000    .201
Cancellation                  9   -.035    .062    .334    .457   -.037   -.029
Mutilated pictures           10    .124    .344    .011    .426    .019   -.003
Arithmetical equations       11    .068   -.018    .088    .748    .001    .162
Fitting shapes               12   -.042    .463   -.045    .649   -.054    .007
Mazes                        13    .010    .327    .149    .462   -.012   -.037
Pattern perception           14    .030    .384    .093    .660    .006    .028
Form analogies               15   -.047    .247   -.088    .705    .151    .270
Classification (rights)      16    .000    .031   -.038    .709    .179    .339
Overlapping shapes           17   -.046   -.025    .077    .648    .060    .212
Abstraction (pairs)          18    .063   -.023    .081    .626    .167    .274
Code                         19    .016   -.120    .505    .597    .044    .067
Code parts                   20   -.014    .105    .392    .687    .022    .046

The only exceptions to the above-mentioned rotational procedure 
are in the case of planes D and E. It was observed that planes A, B, 
C, and E accounted for an appreciable proportion of the variance in 
only about one-half of the variables, and that there was one dimension 
of the five-dimensional system that was relatively unrepresented and 
could not be determined by a bounding hyperplane because of a lack 
of zero projections. Therefore it was decided to set up a normal or- 
thogonal to the normals of all the other planes, and not to try to fit 
this dimension by simple structure. This procedure was necessary 
in order to represent the total common factor variance. In the case 
of plane E, a bounding hyperplane was found, but the projection of 
the highest variables was so low that it was decided to keep E as a
residual plane and to minimize the total variance of all the variables
on this plane. The alternative plane E’ (the bounding hyperplane) is
included in the rotated factorial matrix (Table 2), but no significance
is attached to its interpretation.

TABLE 3
Transformation Matrix (Λ)

Centroid                  Reference Vector
Axis         A       B       C       D       E       E'
I          .205    .138    .104    .961    .032    .167
II         .883   -.336   -.259   -.103   -.030   -.029
III       -.089   -.669    .563    .026    .494    .461
IV        -.390   -.399   -.735    .228    .200    .652
V          .136    .512   -.253   -.109    .845    .577

TABLE 4
Correlations Between Planes (Λ'Λ)

[Table 4 is largely illegible in the source; the one surviving row reads
-.001, .002, -.001, ..., .268.]

[FIGURE 5. Plots between reference axes.]

Table 3 is the transformation matrix for the transformation of 
the centroid matrix to the rotated factorial matrix. No attempt was 
made to keep the reference axes orthogonal (except plane D), but 
Table 4 (the correlations between the reference axes) shows that for 
all practical purposes the system is orthogonal. The smallest angle be- 
tween normals is about 84 degrees. All the plots between the refer- 
ence axes are given in Figure 5. The graphs are made as if the angles 
between the normals were 90 degrees, and the distortion is insignificant.
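
The relation between the three matrices, F' = FΛ, and the near-orthogonality of the reference vectors can be checked numerically. The sketch below is only a check under stated assumptions: the entries are digits as read from Tables 1 and 3 of this copy, several of which are unclear in the print, and only three tests and four reference vectors are included to keep it short.

```python
import math

# Columns of the transformation matrix (Table 3), ordered by centroid axis I..V.
# Digits are as read from this copy and are assumptions where the print is unclear.
NORMALS = {
    "A": (0.205, 0.883, -0.089, -0.390, 0.136),
    "B": (0.138, -0.336, -0.669, -0.399, 0.512),
    "C": (0.104, -0.259, 0.563, -0.735, -0.253),
    "D": (0.961, -0.103, 0.026, 0.228, -0.109),
}
# Three rows of the centroid matrix (Table 1), with the same caveat.
CENTROID = {
    "Inventive synonyms": (0.717, 0.442, -0.096, -0.105, -0.068),
    "Fitting shapes": (0.675, -0.266, -0.325, -0.032, 0.099),
    "Code": (0.625, -0.152, 0.384, -0.198, -0.155),
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rotated_row(test):
    """Projections of one test on each reference vector (one row of F')."""
    return {name: round(dot(CENTROID[test], v), 3) for name, v in NORMALS.items()}

def acute_angle(p, q):
    """Acute angle in degrees between two reference vectors."""
    u, v = NORMALS[p], NORMALS[q]
    cosine = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    angle = math.degrees(math.acos(cosine))
    return min(angle, 180 - angle)

def smallest_angle():
    names = sorted(NORMALS)
    pairs = [(p, q) for i, p in enumerate(names) for q in names[i + 1:]]
    return min((acute_angle(p, q), p, q) for p, q in pairs)

for test in CENTROID:
    print(test, rotated_row(test))
print(smallest_angle())
```

With these readings the rows agree with Table 2 to the third decimal, and the smallest angle falls between the normals of planes B and C, at a little under 84 degrees.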


INTERPRETATION OF THE FACTORS 


In the interpretation of the factors in the rotated matrix, all en- 
tries between .100 and —.100 shall be considered essentially zero, en- 
tries .100 to .300 of small analytical importance, and entries above 
.300 of interpretative significance. 
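
The rule just stated is easily applied mechanically. In the sketch below the column of loadings on plane A is taken from Table 2 as read in this copy; treating negative entries by their absolute size is an assumption, since the text gives the upper bands for positive entries only.

```python
def classify(loading):
    """Band of a rotated loading under the rule stated in the text."""
    size = abs(loading)          # assumption: negative entries judged by magnitude
    if size < 0.100:
        return "essentially zero"
    if size <= 0.300:
        return "small analytical importance"
    return "interpretative significance"

# Column A of Table 2, keyed by test code number (values as read from this copy).
PLANE_A = {
    1: 0.578, 2: 0.346, 3: 0.540, 4: 0.342, 5: 0.419, 6: 0.413, 7: -0.026,
    8: 0.018, 9: -0.035, 10: 0.124, 11: 0.068, 12: -0.042, 13: 0.010,
    14: 0.030, 15: -0.047, 16: 0.000, 17: -0.046, 18: 0.063, 19: 0.016,
    20: -0.014,
}

def salient(column):
    """Code numbers of tests with interpretative loadings, largest first."""
    ranked = sorted(column.items(), key=lambda item: -abs(item[1]))
    return [code for code, value in ranked
            if classify(value) == "interpretative significance"]

print(salient(PLANE_A))
print(classify(PLANE_A[10]), "/", classify(PLANE_A[7]))
```

Applied to plane A, the rule singles out the six tests numbered 1 to 6 and leaves every other test in the zero or small bands.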

In the case of plane A, there are six variables which have factor 
loadings of a size sufficient for interpretative significance: 


1. Inventive Synonyms . . . . . . . .578
3. Understanding Paragraphs . . . . .540
5. Verbal Classification . . . . . . .419
6. Proverbs . . . . . . . . . . . . .413
2. Disarranged Sentences . . . . . . .346
4. Inferences . . . . . . . . . . . .342


There seems to be very little doubt but that this factor is “Verbal”
in nature. Brown and Stephenson recognized the presence of a
verbal group factor and partialed it out of their correlations. By the 
Thurstone type of analysis of the unpartialed correlations, the factor 
shows up clearly and distinctly. The factor apparently has to do with 
verbal reasoning or meaning rather than word usage. 

Column B has five variables of interpretative significance:


12. Fitting Shapes . . . . . . . .463
14. Pattern Perception . . . . . .384
10. Mutilated Pictures . . . . . .344
13. Mazes . . . . . . . . . . . . .327
15. Form Analogies . . . . . . . .247


The essential character that these variables appear to have in 
common is a type of visual or spatial imagery. It is necessary for the 
subject to be able to grasp an image or spatial representation of the 
problem for its successful solution. The factor appears to be con- 
cerned with variables of a problem nature rather than those of a 
classification or speed type. We might call this factor “Space.” 

Factor C has only three variables with projections of a size suf- 
ficient for interpretative significance: 


19. Code . . . . . . . . . . .505
20. Code Parts . . . . . . . .392
9. Cancellation . . . . . . . .334

This factor might be called “Perceptual Speed.” The common
factor seems to be the ability to perceive small detail and to classify 
it quickly. The tasks must necessarily be at a low level of difficulty. 
However, the perceptual aspect of the ability appears to be of more 
importance than the speed, as is indicated by the considerably larger 
projection of Code over Cancellation, the latter being more of a speed 
test. 

Factor D is the orthogonal factor which was set up to be uncorre- 
lated with all the other factors. Every variable has a projection of 
greater than .400, the lowest being: 


10. Mutilated Pictures . . . . . . .426
13. Mazes . . . . . . . . . . . . . .462
9. Cancellation . . . . . . . . . . .457


and the highest: 


8. Alphabetical Series. . . . . . .782 
11. Arithmetical Equations ... . .748 
2. Disarranged Sentences .. . . .735 


One difference between these two groups appears to be a differ- 
ence in the amount of learning involved in doing the task. Tests 8, 
11, and 2, most certainly, are more influenced by differences in rate 
of learning than are Cancellation, Mazes and Mutilated Pictures. 
However, on the other hand, all the tests are dependent upon the 
ability to learn and so should have a significant projection on this 
factor, as they do. 

Of course this factor could be interpreted to be analogous to 
Spearman’s central intellective factor; it is most certainly present in 
every test and accounts for a large part of the common factor vari- 
ance of every test. It does not in any way invalidate the theory of 
multiple factor analysis to postulate a general factor. In this analy- 
sis, the general factor appears to be indeterminate and was arrived 
at by setting it orthogonal to the other factors. 

Another interpretation that could be applied to plane D is that 
the factor is due to differences in maturation of the subjects. This 
explanation has also been postulated by Wright (13) in her paper on 
the factor analysis of the original Stanford-Binet scale. The diagram 
in Figure 6 has been constructed to demonstrate the effect of differences
in maturity on the scores of otherwise uncorrelated tests. If
measurements were made when two individuals were at different
points on the developmental scale but both were at the period of steep
slope of the curve, then the individual who was farther along the
scale of maturation would tend to have the higher score on all abilities
although at maturity there might be no correlation between the
abilities. The artificially high correlations thus produced would tend
to introduce a general factor with the highest projections on the factor
made by those tests whose scores were most affected by maturation.
It seems plausible that scores on Alphabetical Series and Arithmetical
Equations would be more affected by maturation than would
scores on Mazes or Cancellation. Probably the preferred interpretation
for this factor then is “Effect of Maturation of Subjects.” The
maturation theory differs essentially from the concept of Spearman’s
“g” in that it can apply only to analyses of tests given to children.

[FIGURE 6. Effects of maturation on uncorrelated test scores.]

Still a further possible interpretation of the factor D is that the 
children at this age level used the same ability for the solution of 
every test. Some such ability as induction might have been used, as 
any test appears to be quite a problem to children at this age. How- 
ever, some other common factor, as yet unidentified, might have been 
used. 

Another possible interpretation is that the common factor is 
present due to the lack of differentiation of the group factors from a 
single common factor because of the young age of the subjects. This 
interpretation is a corollary to the maturation hypothesis. 

Plane E is a residual plane. It was first made a bounding hyperplane,
but there were no variables with sufficient projection on the
normal to identify it, so it was rotated to minimize the variance of
all the variables. It is essentially orthogonal to all the other factors.

Plane E′ is the original bounding hyperplane E. It is included
here for the sake of completeness, but no attempt at interpretation
is made.


COMPARISONS


This study agrees very well with other factor studies made by 
multiple-factor analysts. Thurstone (11) in his recent monograph 
finds the Space, Verbal, and Perceptual Speed factors among those 
identified in his analysis of fifty-seven tests. He also found these 
same factors represented in two other major factor studies, one of 
which has already been published (12). In this study he attempts 
more fully to analyze the Perceptual Speed factor. He describes it 
in this manner. 


The characteristic that seemed to be common to all of the tests that 
were heavily saturated with the perceptual factor “P” was the readi- 
ness to discover and to identify perceptual detail. .... It might in- 
volve speed as an essential characteristic, but this impression may be 
due to the fact that the perceptual tests were simpler than the tests 
which were heavily saturated with other factors. 


The Verbal factor identified in Thurstone’s study was character- 
ized by Sentence Completion, Opposites, and Verbal Analogies tests. 
The Space factor was characterized by Flags, Pursuit, Designs, and 
other form tests (12). No general factor has been found in any of 
Thurstone’s analyses, but this result is as would be expected if the
general factor in the present study is due to the effect of maturation. 
His studies were made on high-school seniors and college students. 

To further bear out the hypothesis of the effect of maturation 
on test scores, Garrett, Bryan, and Perl (4, p. 273) have found pro- 
gressive reduction in the intercorrelations of a variety of mental 
tests administered to groups of nine-, twelve-, and fifteen-year-old 
children. Factor analyses of Schneck’s and Schiller’s studies (4) 
also gave much higher correlations between tests for the younger 
children of Schiller’s test group than for Schneck’s group of college 
students. 

Garrett (4) also found the Verbal and Space factors among 
others in his analyses. 

Wright (13), in her analysis of Stanford-Binet data, found a 
general factor similar to the factor D in the present study. She was 
also able to identify a Spatial and a Verbal factor. 

The most interesting comparison is between the Thurstone and 
the two-factor analyses of Brown’s and Stephenson’s data. The two- 
factor analysis showed that, after manipulating the correlations in 
various fashions because of “error” of several types, a distribution 
of tetrad-differences could be secured that would indicate only a
“general” and specific factors to account for the test intercorrelations.
The Thurstone analysis, made on the original correlations (there being
no need to partial out a verbal group factor), showed that there
were three interpretable group factors, and a general factor which 
was possibly due to maturation differences of the subjects. There was 
no need to postulate possible errors due to “speed preference” (3, p. 
360) or unknown group factors. The concept of error may be ques- 
tioned in this case. It seems that such phenomena are not errors, but 
valid indications of relations of the variables. 

In all, it would appear that the Thurstone method is a more thor- 
ough, accurate, and fertile method of analysis than the two-factor 
method as applied to this group of tests. 


SUMMARY AND CONCLUSIONS 


In a re-analysis, by the Thurstone techniques, of twenty tests 
used by Brown and Stephenson in a Test of the Theory of Two Fac- 
tors (3), it was found that three group factors and a general factor 
were present: 

a) Factor A, a verbal factor; 

b) Factor B, a spatial factor; 

c) Factor C, a factor of perceptual speed; 

d) Factor D, a general factor possibly due to maturation. 

The factors were found to be orthogonal except for chance devi- 
ations. 

The factors were thought to be consistent with other multiple 
factor analysis findings. 

The Thurstone method of analysis is thought to be a more efficient
method of factor analysis than the two-factor method as applied to
these tests.


In tetrachoric correlation there is a four-fold distribution based
on the four combinations of A’s and not-A’s and B’s and not-B’s.
The analogous problem in test construction is that of determining the
relationship between two items or test questions, upon which all
individuals are scored pass or fail. Diagrammatically, the desired data
may be represented as below:


             F_2        P_2
   P_1 |   P_1F_2     P_1P_2   |  p_1
   F_1 |   F_1F_2     F_1P_2   |  f_1
            f_2        p_2     |   N


Thus nine values are available to work with, the four marginal totals, 
the four cell frequencies, and N. Note that given N and any three of 
the other values it is possible to determine the remaining five values. 
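
The remark that the remaining values follow from N and three others can be checked with a short sketch. The function name, the argument choice (N, the failures on each item, and the fail-fail cell), and the sample numbers are assumptions made for illustration; the three given values must be independent of one another (for example, not a marginal total together with its own complement).

```python
def complete_fourfold(n, f1, f2, f1f2):
    """Fill in a 2x2 pass/fail table from N, two failure totals, and the
    fail-fail cell. Names are illustrative, not taken from the paper."""
    p1 = n - f1                  # number passing item 1
    p2 = n - f2                  # number passing item 2
    f1p2 = f1 - f1f2             # failed item 1, passed item 2
    p1f2 = f2 - f1f2             # passed item 1, failed item 2
    p1p2 = n - f1 - f2 + f1f2    # passed both items
    cells = {"P1P2": p1p2, "P1F2": p1f2, "F1P2": f1p2, "F1F2": f1f2}
    if min(cells.values()) < 0:
        raise ValueError("inconsistent frequencies")
    return {"N": n, "p1": p1, "f1": f1, "p2": p2, "f2": f2, **cells}


# Hypothetical sample: 450 individuals, 180 fail item 1, 135 fail item 2,
# and 90 fail both.
table = complete_fourfold(450, 180, 135, 90)
```

The four cell entries of the returned table always sum to N, which gives a quick consistency check on transcribed frequencies.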

If it is desired, all nine values may be obtained directly and 
simultaneously on the tabulator without any sorting of the cards, pro- 
vided that the appropriate columns are x-punched, and that the cards 
are controlled by means of class selectors.* It is also possible to
obtain simultaneously any or all of the various cell frequencies and
subtotals expressed directly as percentages of the sample. Normally, all
that is needed is $f_1$, $f_2$, and $f_1f_2$. These three values
correspond to the b, c, and a values of the computing diagrams for
tetrachoric correlation, devised by Chesire, Saffir, and Thurstone.† If
no more than this


* This first procedure is specifically adapted to the International Business 
Machines numerical tabulator provided it has extra class selectors. Modification 
of the technique will be necessary for those having machines with fewer class
selectors. Tabulator equipment with digit selection can also be adapted to this
type of analysis. 

† Chesire, L., Saffir, M., and Thurstone, L. L. Computing diagrams for the
tetrachoric correlation coefficient. Univ. Chicago Book Store, 1933.

is required, a simple wiring pattern for the Hollerith tabulator is
outlined below.

The class selector is energized from an impulse in a given control
column and will then allow the impulses from those cards having an
x-punch to be led off to one counter, while the impulses from those
cards not x-punched may be led to another counter. This is due to the
fact that a card with an x-punch in the control column causes a relay
to throw the adding impulse into a counter other than the normal one,
while impulses from non-x-punched cards go directly to the normal
counter.

Consider two items, numbers 1 and 2, which are punched into
columns 1 and 2. Let an x-punch indicate a right or pass response and
a no-x a wrong response or no answer. Then in columns 1 and 2, the
following combinations of punches might appear: $x_1$, $x_2$;
$x_1$, no-$x_2$; no-$x_1$, $x_2$; and no-$x_1$, no-$x_2$; i.e., pass
pass, pass fail, fail pass, and fail fail.


1. Use one card for each individual, letting each column stand for an item.
Punch in each column an x if the item is passed. Skip the column if the
item is failed or omitted.

2. Punch into columns 1-4 the first four significant figures of the reciprocal of
N, the number of individuals.

3. Connect the control brush of column 1 with the hub of class selector 1.

4. Connect positions 1-4 of the add brushes to the “C” position of class
selector 1.

5. Connect the first four positions of the no-x row of class selector 1 with
positions 6-9 of counter 1. Thus the left half of counter 1 will contain the
percentage of individuals getting item 1 wrong, the b value of the computing
diagrams.

6. Energize class selector 2 by connecting with the control brushes of column 2.

7. Connect positions 6-9 of counter 1 with the first four positions of the “C”
row of class selector 2.

8. Connect the first four positions of the no-x line of class selector 2 with the
first four positions of counter 1. This will give the percentage of individuals
who failed both items, the c value of the computing diagrams.

9. Energize class selector 3 by connecting with a hub of class selector 2, i.e.,
controlling on item 2.

10. Connect positions 1-4 of the add brushes to the “C” position of class
selector 3.

11. Connect the first four positions of the no-x’s of class selector 3 with the
first four positions in counter 2. This will give the percentage of
individuals who failed only item 2, the a value of the computing diagrams.

12. The switches on the tabulator should be set as follows:

a) tabulate.

b) all list switches off.

c) all controls off.

d) counter switches set at minor total.

e) summary punch switches off.

f) automatic start and reset switches on.

13. The only change in wiring to get the values for the tetrachoric correlation
between items one and three is changing the control wire from position 2
to position 3.

14. Using this procedure it is not necessary to sort the cards. Simply place
them in the feed rack and start the machine. As soon as the totals are
printed, change the single wire and again insert the pack of cards in the
feed rack and start the machine.


Since the cards go through the machine at the rate of 150 cards 
per minute, a population of 450 individuals can be handled in three 
minutes. Allowing for changes in wiring this would mean at least 15 
inter-item correlations per hour. Thus a 45-item test could be
completely analyzed in 66 hours. If, however, five class selectors are
available instead of three, this time can be cut in half.
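
The arithmetic behind these estimates is easy to reproduce. The sketch below is only a check on the figures quoted in the text; the function name is an assumption, and the rate of 15 correlations per hour is taken from the paragraph above rather than derived.

```python
def analysis_hours(n_items, correlations_per_hour=15):
    """Number of distinct item pairs and the hours needed to run them all."""
    pairs = n_items * (n_items - 1) // 2      # n(n-1)/2 inter-item pairs
    return pairs, pairs / correlations_per_hour


# One pass of 450 cards at 150 cards per minute.
minutes_per_pass = 450 / 150

# A 45-item test at 15 inter-item correlations per hour.
pairs, hours = analysis_hours(45)
```

With these inputs there are 990 item pairs, which at 15 pairs per hour gives the 66 hours stated above.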


AN ALTERNATIVE METHOD 


Occasionally it may be necessary to analyze a test containing 
more than eighty items. The method about to be described makes it 
possible to handle up to 960 items on a single card when a card counter 
is attached to the card sorter. Allow each position in a column, includ- 
ing x and y, to represent an item. If the individual passes the item, 
do nothing; if the individual fails the item, punch the position assigned 
that item. For example, if items 1 to 12 inclusive are assigned to col- 
umn 5, and item 1 is failed, punch out the x position; if item 2 is 
passed, do nothing; if items 3, 4, 5, and 6 are passed, do nothing; but 
if items 7 and 8 are failed, punch out positions 4 and 5; assuming that 
items 9, 10, and 11 are passed, no position is punched, but if item 12 
is failed, punch out the 9. After a card has been prepared and verified 
for each individual, the next problem is sorting to give the desired 
values, namely those failing each item, and the number failing any 
specified pair of items. The steps involved in doing this are outlined 
below: 


1. Set the switches on the card sorter to sort and count. Also depress the rotor 
switches except the one for the x pocket. Set the sorting brush in the de- 
sired column. 

2. Sort the cards. The result will be two packs, one for those who passed item 
1 (found in the reject pocket), and one for those who failed item 1 (found 
in the x pocket). 

3. Record from the card counter the number of individuals who failed item 1,
item 2, etc. to item 12.

4. Remove the pack of cards from the reject pocket and place in a convenient
place until needed.

5. Depress rotor switch for the x position.

6. Clear the card counter.

7. Now take the pack of cards from the x pocket and again sort with all rotor
brushes down. The card counter will now give the frequency of those
individuals who failed in item 1 and item 2, item 1 and item 3, etc. up to item
1 and item 12.

8. Again clear the card counter, but shift the sorting brush to the next
column which should contain items 13 to 24.

9. Again sort, recording from the card counter the frequencies of failure of
item 1 and item 13, item 1 and item 14, etc. to item 1 and item 24.

10. In a similar manner determine the frequencies of the failure of item 1 and 
all the remaining items. 

11. Set the rotor switch for the y position, thus making it effective, and set the 
card brush in the column containing item 2. 

12. Using the entire pack of cards sort out those individuals who failed item 2. 
Again set aside those cards falling in the reject pocket, i.e., those indi- 
viduals who passed item 2. 

13. Depress the rotor switch for the y position. Clear the card counter.

14. Run the cards for those individuals failing item 2 through the sorter and
from the card counter determine the frequency of failure of item 2 and item
3, item 2 and item 4, etc. In a similar manner determine the desired
frequencies for all the remaining items.
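
What the sorter and card counter tabulate in the steps above can be mimicked in a few lines. This is a sketch of the counting only, not of the machine procedure: each card is represented, by assumption, as the set of item numbers punched (failed) on it, and the function and variable names are illustrative.

```python
from itertools import combinations


def failure_counts(cards, n_items):
    """cards: one set of failed item numbers per individual.
    Returns the single-item failure frequencies and the joint failure
    frequencies for every pair of items."""
    items = range(1, n_items + 1)
    single = {i: 0 for i in items}
    joint = {pair: 0 for pair in combinations(items, 2)}
    for failed in cards:
        for i in failed:
            single[i] += 1
        # Every pair of items failed on the same card adds one joint failure.
        for pair in combinations(sorted(failed), 2):
            joint[pair] += 1
    return single, joint


# Five hypothetical individuals and three items.
cards = [{1, 2}, {1}, set(), {1, 2, 3}, {3}]
single, joint = failure_counts(cards, 3)
```

The dictionaries `single` and `joint` hold the frequencies that the procedure reads off the card counter one sort at a time.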


The main objection to this method is the amount of manual labor
required in transcribing the frequencies; the likelihood of making
errors either in reading or transcribing and the necessity of
transforming the frequencies $f_i$, $f_j$, and $f_{ij}$ into percentages
are also significant objections. The number of such percentages to be
computed is 1.5n(n − 1), where n is the number of items. If the number
of such percentages to be computed is large in comparison with N, the
number of individuals (say, twenty times greater), then it will be
profitable to compute a table of 100 × 1/N.
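
The count 1.5n(n − 1) corresponds to three percentages for each of the n(n − 1)/2 item pairs. The sketch below reproduces that count and builds one possible reading of the suggested table, namely a lookup from each frequency to its percentage of the sample; the function names and the layout of the table are assumptions.

```python
def percentages_needed(n_items):
    """Three percentages (f_i, f_j, f_ij) for each of n(n-1)/2 item pairs."""
    return 3 * n_items * (n_items - 1) // 2


def percent_table(n_individuals):
    """Lookup table: entry f holds 100 * f / N for f = 0, 1, ..., N."""
    return [100.0 * f / n_individuals for f in range(n_individuals + 1)]


count = percentages_needed(200)      # a 200-item test
lookup = percent_table(1000)         # N = 1000 individuals
```

For a 200-item test the count is 59,700, far more than twenty times N = 1000, so by the rule of thumb above the table would be worth preparing.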

With this method it is not at all impracticable to attempt the in- 
ter-item analysis of even a 200-item test. The writer estimates that 
such a test, having an N of 1000 and an average item difficulty of 50 
per cent, would require 170 hours of machine time to determine and 
record the necessary frequencies. Transforming the frequencies to 
percentages would require another 150 hours. There would still re- 
main the considerable task of entering the computing diagrams with 
the proper percentages and determining the value of r.


THE ROLE OF CORRELATED FACTORS IN FACTOR ANALYSIS


LEDYARD R. TUCKER 
THE UNIVERSITY OF CHICAGO 


The fundamental factor theorem is developed in matrix form 
for the case of correlated factors. The properties of the correlated 
factor system are discussed, and some effects of sampling error con- 
sidered. The psychological meaning of correlated factors is dis- 
cussed, and several mechanisms by which general factors may oper- 
ate in the factorial system are indicated. 


It is the purpose of this paper to develop a generalized factor 
theorem which applies directly to any set of correlated or uncorre- 
lated factors and to consider the interpretation of correlated psycho- 
logical factors. 

The first and most basic assumption in factor analysis is that a
test score may be expressed as a linear combination of scores on
several factors. This assumption may be written in the form of an
equation:

$$ s_{ji} = a_{j1} x_{1i} + a_{j2} x_{2i} + a_{j3} x_{3i} + \cdots + a_{jt} x_{ti} , \tag{1} $$

where $s_{ji}$ is the standard score of individual i on test j. The x’s
are the standard scores of the individuals on the factors 1, 2, 3, ..., t;
the a’s are the weights to be applied to the factor scores in obtaining
the individual’s score on the test. The standardization of these scores
is assumed to be based on an infinite population. This equation may be
written in matrix form:

$$ S = F P , \tag{2} $$

the entries in S being $s_{ji}$, in F being $a_{jp}$, and in P being
$x_{pi}$, where p is the subscript used to designate the factors.

The correlation between two tests, j and k, may be expressed by
the following formula:

$$ r_{jk} = \frac{1}{N} \sum_{i=1}^{N} s_{ji} s_{ki} , \tag{3} $$

where N is the number of individuals in the sample used in obtaining
the correlation. The standard deviations do not enter into the formula
since they are unity for standard scores. Similarly, the correlation
between the two factors, p and q, is

$$ r_{pq} = \frac{1}{N} \sum_{i=1}^{N} x_{pi} x_{qi} . \tag{4} $$
The correlations between the factors will be allowed to take any val- 
ues so long as the factors remain linearly independent. That is, al- 
though each factor must present some independent portion to justify 
its retention, some overlap will be allowed between the factors. 
Equations (3) and (4) may be written in matrix form:

$$ R_j = \frac{1}{N} S S' , \tag{5} $$

$$ R_p = \frac{1}{N} P P' , \tag{6} $$

where $R_j$ is the matrix of correlations between the tests and $R_p$ is
the matrix of correlations between the factors. When the value of S in
equation (2) is substituted into equation (5), this latter equation
becomes:

$$ R_j = \frac{1}{N} F P P' F' = F \left( \frac{1}{N} P P' \right) F' , $$

which becomes, by (6):

$$ R_j = F R_p F' \, .^{*} \tag{7} $$


In equation (7) the fundamental factorial theorem of Thurstone (2) 
has been generalized to correlated factors. 

It will be noted that if the factors are uncorrelated, $R_p$ is an
identity matrix, and equation (7) reduces to Thurstone’s form of the
theorem as a special case, namely,

$$ R_j = F F' . \tag{8} $$
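
Equation (7) is an algebraic identity and can be checked numerically. The following is a minimal sketch in plain Python; the loadings F and the factor scores P are hypothetical values chosen for illustration, and the unique factors are omitted, so the diagonal of the computed test matrix holds communalities rather than unity.

```python
def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


# Hypothetical loadings: 3 tests on 2 correlated common factors.
F = [[0.8, 0.0], [0.6, 0.3], [0.0, 0.7]]
# Hypothetical standard scores of N = 4 individuals on the two factors
# (each row has mean 0 and variance 1; the two rows correlate 0.8).
P = [[1.0, 1.0, -1.0, -1.0],
     [1.4, 0.2, -1.4, -0.2]]
N = len(P[0])

S = matmul(F, P)                                                  # equation (2)
R_j = [[v / N for v in row] for row in matmul(S, transpose(S))]   # equation (5)
R_p = [[v / N for v in row] for row in matmul(P, transpose(P))]   # equation (6)
R_j_from_factors = matmul(matmul(F, R_p), transpose(F))           # equation (7)
```

Both routes give the same matrix up to rounding, whatever values are chosen for F and P.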


It is of interest that a transformation exists which carries the results 
of the one type of analysis to the other. This transformation makes 
it possible to use the present methods of factoring to obtain an arbi- 
trary set of uncorrelated reference factors and then to transform them 
to a final set of factors which are either correlated or uncorrelated. 
Since the factors have been assumed to be linearly independent, 


* Since writing this manuscript, Professor Karl J. Holzinger has called the 
author’s attention to a publication by himself and Mr. Harry H. Harman in which 
our matrix equation was written in expanded form (7, p. 324, eq. 4). The
expanded form is, however, extremely laborious in use, both in derivations and
computational work.
the rank of $R_p$ is equal to its order, and it possesses an inverse.
$R_p$ may now be reproduced by:

$$ R_p = H H' , \tag{9} $$

where H is a matrix of rank and order equal to the order of $R_p$. The
matrix H also possesses an inverse. Substituting $R_p$ of (9) in (7),

$$ R_j = F H H' F' . \tag{10} $$

Then, if matrix $F_0$ is defined as follows:

$$ F_0 = F H , \tag{11} $$

(10) becomes

$$ R_j = F_0 F_0' . \tag{12} $$

By (8) it is seen that $F_0$ is a factorial matrix for a set of
uncorrelated reference factors. Thus, the transformation from the
correlated factor case to the uncorrelated case has been completed. The
reverse transformation is also possible. The transformation which may be
applied to a matrix with uncorrelated factors is, by (11),

$$ F = F_0 H^{-1} . \tag{13} $$


Equation (9) gives the correlations between the factors when a ma- 
trix H is known. Thus it is possible to use any of the factorial meth- 
ods that produce an uncorrelated set of reference factors and then to 
transform these factors to another set of factors which are either 
correlated or uncorrelated. 
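
The passage from correlated to uncorrelated factors and back, equations (9) to (13), can be sketched for two factors. The text only requires some H with HH′ equal to the factor correlation matrix; the lower-triangular choice below is an assumption made for convenience, and the loadings and the correlation of 0.8 are hypothetical. The uncorrelated reference matrix is written `F0` in the code.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


r = 0.8                                     # hypothetical factor correlation
R_p = [[1.0, r], [r, 1.0]]
s = math.sqrt(1.0 - r * r)
H = [[1.0, 0.0], [r, s]]                    # one H satisfying H H' = R_p, eq. (9)
H_inv = [[1.0, 0.0], [-r / s, 1.0 / s]]     # closed-form inverse of this H

F = [[0.8, 0.0], [0.6, 0.3], [0.0, 0.7]]    # hypothetical correlated-factor loadings
F0 = matmul(F, H)                           # equation (11)
R_from_F0 = matmul(F0, transpose(F0))       # equation (12)
R_from_F = matmul(matmul(F, R_p), transpose(F))   # equation (7)
F_back = matmul(F0, H_inv)                  # equation (13)
```

Any other H with the same product HH′ (this H postmultiplied by an orthogonal matrix) would serve equally well; it would give a different `F0` but the same test correlations.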

In practice this transformation may be simplified by the definition
of the matrix A, as follows. Let

$$ A = H^{-1} D , \tag{14} $$

where D is a diagonal matrix such that the columns of A are unit
vectors (that is, the sums of the squares of the entries in the columns
of A are unity). Then $F_0$ is postmultiplied by A to obtain a matrix V,

$$ V = F_0 A . \tag{15} $$

But, by (14),

$$ V = F_0 H^{-1} D , $$

and, by (13),

$$ V = F D , \tag{16} $$

or

$$ F = V D^{-1} . \tag{17} $$

The factorial matrix F is proportional by columns to V.
By (14),

$$ H = D A^{-1} , \tag{18} $$

and, by (9),

$$ R_p = D A^{-1} (A^{-1})' D' , $$

$$ R_p = D (A' A)^{-1} D' . \tag{19} $$

Thus, if A and V are known, F, H, and $R_p$ may be found by equations
(17), (18) and (19), the condition on D being that the diagonal
entries in $R_p$ must be unity.
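
The recovery of F, H, and the factor correlations from A and V can be sketched for two factors as follows. The matrices `F0` and `A` are hypothetical, the helper `inv2` handles only 2 × 2 matrices, and D is fixed, as the text requires, by making the diagonal of the factor correlation matrix unity.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical uncorrelated reference loadings and a transformation whose
# columns (0.6, -0.8) and (0, 1) are unit vectors.
F0 = [[0.8, 0.0], [0.84, 0.18], [0.56, 0.42]]
A = [[0.6, 0.0], [-0.8, 1.0]]

V = matmul(F0, A)                                   # equation (15)
M = inv2(matmul(transpose(A), A))                   # (A'A)^{-1}
d = [1.0 / math.sqrt(M[p][p]) for p in range(2)]    # makes diag(R_p) unity
D = [[d[0], 0.0], [0.0, d[1]]]
R_p = matmul(matmul(D, M), D)                       # equation (19); D' = D
F = [[V[j][p] / d[p] for p in range(2)] for j in range(3)]   # equation (17)
H = matmul(D, inv2(A))                              # equation (18)
```

In this example the factors correlate 0.8, and the zero entries of V reappear as zero entries of F, since the two matrices differ only by a positive multiplier on each column.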

The reason for the use of A in performing the transformation
from an original set of uncorrelated factors to the final set of factors
can be discussed best after the investigation of the special case of
uncorrelated transformed factors. In this case,

$$ R_p = I ; \tag{20} $$

then, by (9),

$$ H H' = I , \tag{21} $$

and

$$ H^{-1} = H' . \tag{22} $$

But the sums of the squares of the entries in the columns of H′ are
already unity, as shown by equation (21), and therefore D, in
equation (14), is an identity matrix, that is,

$$ D = I . \tag{23} $$

Then by (14) and (22),

$$ A = H' , \tag{24} $$

and by (21),

$$ A' A = I . \tag{25} $$

By (15), (17), and (23),

$$ F = V = F_0 A . \tag{26} $$


The matrix V is the transformed factorial matrix F in this special 
case instead of being merely proportional by columns to F , as shown 
for the general case in equation (17). But this result may be used in 
the interpretation of any V , whether the factors are correlated or un- 
correlated. If a single column of a V is considered at a time, the corre- 
sponding column of A may be placed in a new matrix A, , which satis- 
fies equation (25), and therefore is a transformation to a set of un- 
correlated factors. That column of V then becomes a column of fac- 
tor loadings in the new matrix V, to which it has been transferred. 
Thus, the entries in any column of V may be considered to be loadings 
on a factor which is independent of the rest of the system. 

The advantages of using A in transforming an $F_0$ to a final F are
that its use not only gives entries in V proportional by columns to the
entries in F, but that these entries in V have a special significance,
and that the condition that each column of A is to be a unit vector
applies directly to that column without any reference to its relation to
the other columns of A. In order to know $H^{-1}$, the entire matrix H
must be known. The use of A allows each factor to be determined
independently. Since the entries in a column of V are proportional to
the entries in the corresponding column of F, zero entries are not
altered in using V instead of F, and the same simple structure will
exist in the two matrices. But some other condition could be placed 
on the columns of A instead of their being unit vectors. Such a con- 
dition might be that the largest entry in each column of A be unity. 
The condition of unit vectors has been chosen because of the special 
significance of the entries in V, as previously developed, when this 
condition is used. 

The final transformation, A, may be built up either graphically 
(1, 2, 3, 4) or by means of analytic criteria (2,5). The matrix V may 
then be found by equation (15) and $R_p$ by equation (19). H and F
are seldom found, since V is proportional by columns to F and there- 
fore may be used directly in the interpretation of the factors. 

The correlations between the tests and the factors are often
desired for predictive purposes. The formula for these correlations is:

$$ r_{jp} = \frac{1}{N} \sum_{i=1}^{N} s_{ji} x_{pi} , $$

or, in matrix form:

$$ R_{jp} = \frac{1}{N} S P' , $$

where $R_{jp}$ is the matrix of correlations of the tests j with the
factors p.
By (2)

$$ R_{jp} = \frac{1}{N} F P P' ; $$

by (6)

$$ R_{jp} = F R_p \, ;^{*} $$

by (9)

$$ R_{jp} = F H H' ; $$

and by (11)

$$ R_{jp} = F_0 H' . \tag{27} $$

It will be noted that only when the factors are uncorrelated, and $R_p$
an identity matrix, is the matrix F also the matrix of correlations
between the tests and the factors.
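
The difference between the weights in F and the test-factor correlations can be seen in a small numerical sketch. The loadings, the factor correlation of 0.8, and the particular H are hypothetical values chosen for illustration.

```python
def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


F = [[0.8, 0.0], [0.6, 0.3], [0.0, 0.7]]    # hypothetical weights on two factors
R_p = [[1.0, 0.8], [0.8, 1.0]]              # hypothetical factor correlations
H = [[1.0, 0.0], [0.8, 0.6]]                # satisfies H H' = R_p

R_jp = matmul(F, R_p)                       # correlations of tests with factors
F0 = matmul(F, H)                           # equation (11)
R_jp_via_F0 = matmul(F0, transpose(H))      # equation (27)
```

Test 1 has zero weight on factor 2 yet correlates about 0.64 with it, because the two factors are themselves correlated.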


* This equation was written in expanded form by Professor Karl J.
Holzinger and Mr. Harry H. Harman (7, p. 322, eq. 2).

But the development of the mathematical properties of the
factorial system is not sufficient for a complete discussion of the effect of
allowing correlated factors; the psychological limitations of the sys- 
tem must also be considered. The simplest case, which is to be con- 
sidered first, is for an infinite population when no sampling errors 
have been introduced into the system. The effects of sampling will 
be considered later. 

Thurstone (2) recognized three types of factors when dealing 
with uncorrelated factors. He assumed that there were factors com- 
mon to two or more tests, other meaningful factors present in only 
one test, and still other factors present in each test which arose from 
the chance errors of measurement. Each test could have loadings on
any common factor, but only on the specific factor and measurement 
error factor which were associated with the test. The only revision 
necessary in the present context is that the common factors may be 
correlated. The specific factors may be defined as being uncorrelated 
with all other factors. This definition throws the burden of descrip- 
tion of all the correlations between the tests upon the common factors. 
Further, the variations of the scores on any test which depend upon 
the test’s specific factor most surely should be independent of all other 
tests given in the battery, or that factor would not be specific to that 
test. In an infinite sample the variable errors of measurement should 
be uncorrelated with everything. The division of the factors into these 
three categories is based entirely on psychological reasons. If it is 
the operational unities of the mind that are to be discovered by the 
use of factor analysis, then these divisions are required. They are 
part of the restrictions placed upon the system by the science of psy- 
chology. No statistical or mathematical considerations can remove 
them; rather, they are to be dealt with mathematically and statisti- 
cally as a part of the problem. 

But we must take into consideration what occurs to the system 
when a finite sample is used rather than the infinite population used 
above. It will be assumed that, for any particular subject tested, equa- 
tion (1) still holds as long as that subject’s scores are expressed as 
standard scores in terms of the entire infinite population. While these 
scores are impossible to obtain, they are a good starting point in the 
investigation of the effects of testing only a sample of the population. 

If S, and P, are the sections of the S and P matrices containing 
the scores for the subjects used in the sample, then in accordance with 
equation (2): 


S, = FP,. (28) 


But these scores are not standard scores with reference to the sample
tested, for the means and standard deviations for the sample are
different from the means and standard deviations for the entire
population. The first problem is to adjust these scores to standard
scores in terms of the sample used. Equation (28) may be written in
summation form:

$$ s_{ji} = \sum_{p=1}^{t} a_{jp} x_{pi} , \tag{29} $$


t being the number of factors. The equations for the means of the
scores on the tests and on the factors are:

$$ m_j = \frac{1}{N} \sum_{i=1}^{N} s_{ji} , \tag{30} $$

$$ m_p = \frac{1}{N} \sum_{i=1}^{N} x_{pi} , \tag{31} $$

$m_j$ being the mean for test j, and $m_p$ the mean for the factor p.
Substituting the value of $s_{ji}$ from equation (29) into equation (30)
and then rearranging:

$$ m_j = \sum_{p=1}^{t} a_{jp} \left( \frac{1}{N} \sum_{i=1}^{N} x_{pi} \right) . $$

This becomes, upon substituting for $\frac{1}{N} \sum_{i=1}^{N} x_{pi}$
its equivalent, $m_p$, as given in (31):

$$ m_j = \sum_{p=1}^{t} a_{jp} m_p . \tag{32} $$

The scores may now be expressed as standard scores with respect to
the sample used, $u_{ji}$ being the standard score on test j with
respect to the sample, and $y_{pi}$ being the standard score on the
factor p with respect to the sample:

$$ u_{ji} = \frac{1}{\sigma_j} ( s_{ji} - m_j ) , \tag{33} $$

$$ y_{pi} = \frac{1}{\sigma_p} ( x_{pi} - m_p ) , \tag{34} $$

where $\sigma_j$ is the standard deviation for test j, and $\sigma_p$ is
the standard deviation for factor p. When the value of $s_{ji}$ in
equation (29) and the value of $m_j$ in equation (32) are substituted
into equation (33),

$$ u_{ji} = \frac{1}{\sigma_j} \left( \sum_{p=1}^{t} a_{jp} x_{pi} - \sum_{p=1}^{t} a_{jp} m_p \right)
          = \frac{1}{\sigma_j} \sum_{p=1}^{t} a_{jp} ( x_{pi} - m_p ) , $$

which becomes, upon consideration of equation (34),

$$ u_{ji} = \sum_{p=1}^{t} \frac{\sigma_p}{\sigma_j} a_{jp} y_{pi} . $$

Or, when

$$ b_{jp} = \frac{\sigma_p}{\sigma_j} a_{jp} , \tag{35} $$

then

$$ u_{ji} = \sum_{p=1}^{t} b_{jp} y_{pi} . \tag{36} $$


The importance of equations (35) and (36) is that the linear com- 
bination of equation (1) still holds when a sample group is tested in- 
stead of the whole population. The only change is in the weights used, 
and this change is brought about by multiplying factors only. Thus, 
the simple structure is not changed by these effects of sampling, for 
zero weights must remain zero. 
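
The restandardization leading to equations (35) and (36) can be traced numerically. In this sketch the weights and the five individuals' factor scores are hypothetical, the unique factors are omitted, and the sample standard deviations use the divisor N, as in the formulas above.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def standardize(rows):
    """Rescale each row to sample mean 0 and sample s.d. 1 (divisor N).
    Returns the standardized rows and the sample standard deviations."""
    out, sds = [], []
    for row in rows:
        m = sum(row) / len(row)
        sd = math.sqrt(sum((v - m) ** 2 for v in row) / len(row))
        out.append([(v - m) / sd for v in row])
        sds.append(sd)
    return out, sds


a = [[0.8, 0.0], [0.6, 0.3], [0.0, 0.7]]     # hypothetical population weights
x = [[0.5, -1.2, 0.3, 1.9, 0.1],             # hypothetical factor scores of a
     [1.1, 0.4, -0.7, 0.2, -1.5]]            # sample of five individuals
s = matmul(a, x)                             # equation (29)
u, sigma_j = standardize(s)                  # equation (33)
y, sigma_p = standardize(x)                  # equation (34)
b = [[sigma_p[p] / sigma_j[j] * a[j][p] for p in range(2)]
     for j in range(3)]                      # equation (35)
u_from_b = matmul(b, y)                      # equation (36)
```

The scores computed by the two routes agree, and the zero entries of `a` are still zero in `b`, which is the point made about simple structure.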

But the sampling has affected the correlations between the fac- 
tors. The specific factors and measurement error factors are no long- 
er uncorrelated. It is now a statistical problem to separate these fac- 
tors from the common factors as well as possible, and thus to obtain 
as close an approximation to the true common factor matrix as pos- 
sible. One way open is to eliminate the specific variance and meas- 
urement error variance from the diagonal entries of the experimental 
correlation matrix and then to factor the resulting matrix. This elim- 
ination separates out a diagonal factorial matrix whose factors are 
uncorrelated in the sample used. But the side entries of the correla- 
tion matrix have been affected by the sampling and have not been ad- 
justed. The factors obtained from these correlations will also be af- 
fected ; but this is not all — a new section has been added to the fac- 
torial matrix. This new section is needed to complete the description 
of the side entries of the correlation matrix. It is rarely explicitly 
determined, its effects being left as residuals when the factoring has 
been completed. Since the sampling also affects the correlations 
among the common factors, it would be indeed surprising to find any 
set of uncorrelated factors which represent operational units in the 
minds of the subjects tested. Thus, some of the correlation between
factors may be attributed to sampling error, but some of the
correlations between factors may arise from psychological phenomena.

Psychologically, the factors may be neither ultimate nor indivis- 
ible. Further study of them probably will show that each factor de- 
pends upon the coordinated activity of many mental elements. Some 
factors may be primarily determined by the structure and dynamic 
activity of the human body. Other factors may originate in the train- 
ing and past experience of the individuals. Thus, one may think of 
different domains of factors. The factors in one domain may be af- 
fected by factors in other domains. The effect of these connections 
between factors in different domains is to leave the factors corre- 
lated. 

The author has found that the comprehension of the above-de- 
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FIGURE 1 


scribed relations and their implications could be simplified by the 
adaptation of a diagrammatic schema originally devised by Sewall 
Wright in connection with “path coefficients” (6). An example of this 
schema is given in Figure 1. The circles are used to designate the 
tests. The squares are used to designate the factors. Only the common 
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factors are represented; all specific and measurement error factors 
are omitted. The lines are used to represent a postulated “flow of 
variance,” the arrowhead indicating the direction of effect. The fac-
tors are divided into two domains. The tests are affected by the fac-
tors in factorial domain α which are, in turn, affected by the factors
in factorial domain β. Factor 1 in the factorial domain α affects both
tests 1 and 2, which are consequently correlated. Factor 2 in the fac-
torial domain β affects both factors 1 and 3 in the factorial domain α,
which are, therefore, correlated. If the test intercorrelations were
factored, the factors in factorial domain α would be obtained. Simi-
larly, if the correlations between the factors in domain α were fac-
tored, the factors in domain β would be obtained. The rank of the
matrix of common factors in the tests is the number of factors in do-
main α, which in the example is three. When factoring the correla-
tions between the factors in domain α, the rank of the matrix of com-
mon factors would be the number of factors in domain β, which is
two in the example. This rank may be less than the rank of the ma-
trix of common factors in the tests. But each factor in domain α has
its own specific factor. This specific factor affects the correspond-
ing factor in domain α and, thus, several of the tests. It therefore in-
directly affects the correlations between these tests. Thus, the num-
ber of factors in domain β which affect the intercorrelations of the
tests is equal to the number of common factors in domain β plus as
many specific factors as there are common factors in domain α. There-
fore, the factors in domain β are not found when the test intercorre-
lations are factored, the rank of this matrix of correlations being
equal to the number of factors in domain α alone.

The question of a general factor may now be discussed in terms 
of correlated factors and the schema described previously. Two pos-
sible cases present themselves. Is the general factor one of the fac-
tors in domain α, or is it one of the factors in domain β? Figure 2
presents these two cases. A simple example of six tests, two group
factors, and a general factor has been chosen. In each case, tests 1,
2, and 3 are affected by group factor 1; and tests 4, 5, and 6 are af-
fected by group factor 2. In case I the general factor, G, affects all
tests directly and independently of the group factors. In this case it
is placed in domain α. In case II the general factor affects the group
factors and then the tests. It is placed in domain β in this case. In
case I the rank of the matrix of common factors in the tests is three.
One of these factors, G, may not be found by the criterion of simple
structure; but it can be set uncorrelated with the group factors which
are found by this criterion. In case II the rank of the matrix of com-
mon factors in the tests is two, one less than in case I. The group
factors are correlated in this case, and the general factor may be
found by factoring the matrix of correlations between these factors. 
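
The two cases of the general factor can be checked numerically. What follows is only a minimal sketch in Python and is not part of the original study: the test loadings and the saturations of the group factors on G are hypothetical values chosen for illustration, and the helper names (matmul, rank, common_correlations) are our own. For each case it forms the matrix of correlations due to the common factors, with communalities in the diagonal, and counts its rank.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def rank(M, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n_rows, n_cols = len(M), len(M[0])
    r = 0
    for c in range(n_cols):
        pivot = max(range(r, n_rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < tol:
            continue                      # no usable pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, n_rows):
            f = M[i][c] / M[r][c]
            for j in range(c, n_cols):
                M[i][j] -= f * M[r][j]
        r += 1
        if r == n_rows:
            break
    return r

def common_correlations(A, Phi):
    """A Phi A': correlations due to common factors, communalities on the diagonal."""
    return matmul(matmul(A, Phi), transpose(A))

# Case I: G (first column) and two group factors, all uncorrelated.
# All loadings are hypothetical.
A_I = [[.5, .6, 0], [.6, .5, 0], [.4, .7, 0],
       [.5, 0, .6], [.6, 0, .5], [.4, 0, .7]]
PHI_I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Case II: only the group factors affect the tests; G acts through them.
A_II = [[.6, 0], [.5, 0], [.7, 0],
        [0, .6], [0, .5], [0, .7]]
g = [.8, .7]   # hypothetical saturations of the two group factors on G
PHI_II = [[1, g[0] * g[1]], [g[0] * g[1], 1]]

rank_I = rank(common_correlations(A_I, PHI_I))
rank_II = rank(common_correlations(A_II, PHI_II))
print(rank_I, rank_II)   # prints: 3 2
```

In the second case the single correlation between the group factors is the product of their saturations on G, so that factoring the correlations between the factors returns the one general factor.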
EXPERIMENTAL STUDY OF SIMPLE STRUCTURE


L. L. THURSTONE 
UNIVERSITY OF CHICAGO 


A battery of thirty-six tests was given to a group of high-school 
seniors. The factorial analysis reveals essentially the same primary 
factors that were found in previous studies. The test battery reveals 
a simple structure. 


In two previous factorial studies of large batteries of psychological 
tests, simple structure was found in the rotated reference frame (1, 2).
The factorial experiment to be reported here constitutes a third in- 
vestigation of a large test battery with regard to the existence of 
simple structure and the psychological interpretation of the rotated 
reference axes. As a secondary problem we devised eight new tests 
which were inductive in character, according to a previous tentative 
interpretation of this factor. These new tests were included in the 
battery in order to determine whether they had a factor in common 
which would sustain the tentative interpretation of the inductive fac- 
tor. Most of the tests were revised forms of previous tests in that 
the new battery was arranged for machine scoring. Some of the tests 
were only slightly altered for this purpose, while other tests were 
altered considerably. In revising the tests for machine scoring the 
previous names have been retained for convenience in identifying the 
nature of the test content. The problem whether the factorial com- 
position of a test is invariant when it is moved from one test battery 
to another, or when the tests are given to comparable populations, 
must be answered in terms of future experiments in which the same 
tests will be incorporated in different test batteries for the same popu- 
lation and for comparable populations. 

In the first large test battery to be investigated by the factor 
methods (1), the experimental population was a group of 240 volun- 
teer students at the University of Chicago. They constituted a highly 
selected group. Our second study (2) of this kind was done at the 
Lane Technical High School on a class of several hundred seniors. The 
present study was made with 286 seniors at the Hyde Park High 
School in Chicago. The tests were given on five consecutive days, 
October 18 to 22, 1937. This population was different from the previ- 
ous experimental populations in that it represented more nearly the 
students found in high schools with a general curriculum and in which 
a fairly large proportion of the graduates continue their education in
one of the professions. 

In Table 1 we have a list of the names of the tests and their 
saturations on the primary reference axes. The tests were given in 
two parts, a fore-exercise with appropriate instructions, and the test 
proper. These parts were given with separate time limits. The time 
limits are listed in Table 1; the first entry is the number of minutes 
for the fore-exercises, the second the number of minutes for the test 
proper. The fore-exercise time was regarded as flexible. Each of the 
memory tests had four time limits, the first two for the fore-exercise, 
the third for study time, and the fourth for recall. Test 30 required 
no fore-exercise. 

We shall describe very briefly the nature of each test. For the 
reader who may want the complete set of the tests, we have filed with 
the American Documentation Institute, in Washington, D. C., a film 
record of the complete test battery, including instructions and fore- 
exercises (3). The availability of these film records makes it unneces- 
sary to reproduce in print the whole set of tests. 

Except for adaptation to machine scoring, the content of a large 
number of the tests was the same as in previous tests of the same 
names. These tests were: 1, Addition; 2, Areas; 3, Arithmetic; 4, 
Cards; 5, Completion; 7, Disarranged Words; 10, Identical Forms; 
11, Identical Numbers; 12, Initials Recall; 16, Mechanical Movements; 
18, Multiplication; 20, Number Series; 22, Proverbs; 23, Pursuit; 28,
Same-Opposite; 30, Spelling; 31, Squares; 32, Verbal Analogies; 33,
Verbal Enumeration; 34, Word Number.

In assembling the new battery we retained two or three tests for 
each of the primary abilities that had been found in previous studies. 
To these were added certain new tests which were included for the 
purpose of investigating the several factors. The number factor N 
was represented by 1, Addition; 3, Arithmetic; 18, Multiplication; 19, 
Number Patterns; and 20, Number Series. The only new test in this 
list is Number Patterns. In this test the subject is shown a square 
with five rows and five columns. He is asked to discover the rule which 
determines the spatial arrangement of the digits and to determine the 
particular digit which belongs in a given cell. In the instructions and 
fore-exercise it is explained that the digits are arranged in consecu- 
tive order in all of the problems, that the sequence may run from 0 to 
9 or parts of this range, and that the problems differ only in the
spatial arrangement of the numerical sequence in the squares. This
test was devised as a test of induction which is numerical in content.
The inductive character of the test is in the nature of the task, to dis-
cover, for each square, the spatial arrangement of the sequences of
digits so that the missing number X can be determined.

The verbal factor V was represented in 5, Completion; 6, Direc- 
tions; 22, Proverbs; 28, Same-Opposite; and 35, Word Patterns. The 
Directions test consisted of short instructions to be carried out by the 
subject on the test form. It was similar to a test of the same name by 
Woodworth and Wells. 

The space factor S was represented by 4, Cards; 9, Figures; and 
31, Squares, which were improved forms of previous tests for this 
factor. 

The memory factor M was represented by 12, Initials Recall; and 
34, Word Number. Both of these tests have been used previously in 
hand-scoring form. 

The perceptual factor P was represented by 10, Identical Forms; 
11, Identical Numbers; 27, Repeated Letters; 29, Scattered X’s; and 
33, Verbal Enumeration. These tests are similar to tests previously 
used. In Repeated Letters the subject ringed all letter combinations in 
which two or three adjacent letters were identical. The test form 
consisted of several pages of pied letters. 

The word factor W was represented by 7, Disarranged Words; 
and 17, Mirror Reading. In both of these tests the subject was asked 
to extract words from disarranged letters. 

The inductive factor I was represented in this test battery by
several new tests. The tests for this factor were as follows: 8, Figure 
Grouping; 13, Letter Grouping; 14, Letter Series; 15, Marks; 19, 
Number Patterns; 20, Number Series; 21, Patterns; 35, Word Pat- 
terns. Figure Grouping is an adaptation of Spearman’s Figure Classi- 
fication test that was used in an earlier study (1). In Letter Group- 
ing the subject is shown four groups of letters with five letters in each 
group. Three of the groups have something in common, and the sub- 
ject is asked to mark the odd group. This test was designed by Mr. 
Herbert Landahl. In Letter Series the subject is shown a series of 
letters such as aabcedeef, and he is asked to write the next letter in 
the series. The test is arranged in increasing order of difficulty. It 
was designed by Thelma Gwinn Thurstone. The test called Marks was 
a paper-and-pencil form of the Yerkes multiple-choice test. Number 
Patterns was described in a previous paragraph. The test called Pat- 
terns involved the discovery of repetitions in a pattern of rectangular 
parts such as are used in some linoleum rugs. Word Patterns was a 
test of induction on columns of words. It was designed by Miss Leone 
Chesire. 

Deduction was represented by three syllogistic tests that varied 
somewhat in content and form. Two of these tests were adapted from 
reasoning tests by Cyril Burt. In addition to these deductive tests we 
have the test in arithmetical reasoning with statement problems,
which has been found previously to be deductive in character. The 
Mechanical Movements test is deductive in form but fundamentally 
different from the other deductive tests in content. The other deduc- 
tive tests are all verbal. Verbal Analogies is deductive, and the Num- 
ber Series test involves both induction in the discovery of the rule and 
deduction in checking the rule in the answers. 

Pearson product-moment correlations were determined for all 
pairs of tests. The product-moment correlations are shown in Table 2. 
The centroid matrix with eleven factors is shown in Table 3, and the 
frequency distribution of eleventh factor residuals in Figure 1. The 
projections of the test vectors on the rotated reference frame are also
shown in Table 1. In this table eight primary factors have been indi- 
cated in accordance with the nature of the tests which have large 
saturations on each primary. Three residual planes 9, 10, and 11 are 
left without interpretation. 

The matrix of the transformation Λ from the centroid matrix to
the rotated matrix of primary reference axes is given in Table 4, and
the intercorrelations of the primary reference factors are shown in
Table 5. 

In Table 1 the first column contains 22 factor loadings that are in
the range ±.10, and it contains three entries that are higher than .30.
The three tests and their saturations on this factor are Identical
Forms (.58), Scattered X’s (.35), and Verbal Enumeration (.54).
These are tests which have previously been interpreted as representa- 
tive of the perceptual factor P. 

The second column contains 29 nearly vanishing entries in the
range ±.10, and it has three tests with saturations greater than .30.
These are Addition (.54), Arithmetic (.38), and Multiplication (.62). 
Variable No. 37, Sex, has a negative loading of —.31 on this factor 
which indicates that the boys did better in the numerical tests than 
the girls. Two other tests had saturations near .30, namely, Number 
Patterns (.29) and Number Series (.27). These findings agree with 
the results of previous factor experiments in that the simple number 
tests such as Addition and Multiplication have the highest saturations 
on the number factor N, while the more complex tests involving nu- 
merical content have lower saturations on this factor. There can 
hardly be any question about the identification of this factor. 

The column W has 21 nearly vanishing factor loadings, and it has 
or possibly three, significant loadings. The tests are Disarranged 
Words (.42), Mirror Reading (.42), Letter Series (.34), and Num- 
ber Patterns (.30). This factor is tentatively identified by Dis- 
arranged Words. The reading of words in a reversed position, as in a 
mirror, has something in common with the deciphering of words in 
which the letters are presented in random order. The psychological 
nature of this factor should be investigated in order to determine its 
fundamental nature. In a previous factor experiment the Disarranged 
Words also showed high saturation on a factor which is independent 
of the verbal factor V. 

The column V has 21 nearly vanishing factor loadings, and it has
significant loadings on the following tests: Completion (.58), Direc- 
tions (.42), Proverbs (.65), Same-Opposite (.68), Verbal Analogies 
(.41), Word Patterns (.46). There are lower but possibly significant 
saturations on the following: Letter Grouping (.31), the three syllo-
gism tests (.34, .26, .35), Spelling (.36). This is clearly the same fac-
tor which has been identified in previous experiments as the verbal
factor V. Its identification seems to be as clear as that of the number 
factor. 

The column S has 31 nearly vanishing entries. The following 
tests have significant factor loadings on this factor: Cards (.63), 
Figures (.69), Squares (.42). This is the space factor S which has 
been found in previous experiments. In previous studies of these tests
the Pursuit test has involved different saturations in the space factor 
and the perceptual factor. In simplifying a test there seems to be a 
tendency for the factorial composition to become more perceptual in 
character. Whether this implies a shift toward some form of speed 
factor with the simplification of a test cannot be determined from data 
so far available, but this is a possible interpretation. Factorial study 
of speed of performance of simple tasks might reveal not only the 
relation of one or more speed factors to the primaries here discussed, 
but also the nature of the perceptual factor. 

The column M has vanishing projections on all the tests except 
the two memory tests which were included in this battery, namely, 
Initials Recall (.59) and Word Number (.58). 

The column I has no saturations so high as those which identify
the number, verbal, space, and memory factors. This column has 19
vanishing projections. Listing all the saturations above .25 we have
the following: Directions (.29), Figure Grouping (.31), Letter Group-
ing (.36), Letter Series (.39), Marks (.43), Number Patterns (.39),
Number Series (.26), Patterns (.26), Pursuit (.26), and Squares
(.27). The Word Patterns test has a loading of .24. No one test has a
high saturation on the factor I, but it is probably significant that all
of the eight tests that were specially designed to involve inductive 
thinking are included in the list. None of the specially designed in- 
ductive tests has zero saturation on this factor. All but three of the 
above tests were included in the battery for the specific purpose of 
representing the inductive factor. The three exceptions are Direc- 
tions, which was a new test of unknown factorial composition, the 
Pursuit test, which has been erratic before as regards the space fac- 
tor and the perceptual factor, and Squares, which was designed to be 
a test of the space factor in which it does have a higher saturation, 
namely, .42. Pursuit and Squares are both tests which can be done in 
at least two ways, one way being faster than the other. The discovery 
of the best way to do a test might be responsible for a small inductive 
component. According to these results the Yerkes multiple-choice 
test, which is here called Marks in a paper-and-pencil form, is the best 
test for the inductive factor. The next best tests for this factor would 
seem to be Letter Series and Number Patterns. The tests in this 
list have, on the average, about ten per cent of their total variance 
attributable to the inductive factor. It is a question for further ex- 
perimental study to determine whether tests can be devised for this 
factor which have higher saturations. Any test for this factor may 
be subject to the limitation that an inductive task requires some form 
of content which may involve other primaries. Hence, the factorial
composition of an inductive test may be necessarily more complex 
than tests for the other primary factors. This difficulty may be over- 
come by the discovery of some measurable parameters which repre- 
sent this factor more directly than the performance of a complex in- 
ductive task. 
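
The statement that the listed tests have about ten per cent of their variance on the inductive factor can be checked with a short computation. The sketch below is illustrative only: it takes the ten saturations above .25 as rounded from the inductive column of Table 1 (Letter Grouping taken as .36), and it treats the squared saturation as the share of variance, which is strictly correct only for uncorrelated factors and is therefore an approximation here.

```python
# Saturations on the inductive factor, rounded from Table 1 (assumed readings).
loadings = {
    "Directions": .29, "Figure Grouping": .31, "Letter Grouping": .36,
    "Letter Series": .39, "Marks": .43, "Number Patterns": .39,
    "Number Series": .26, "Patterns": .26, "Pursuit": .26, "Squares": .27,
}

# Share of each test's variance attributed to the factor: the squared saturation.
shares = {test: a * a for test, a in loadings.items()}

mean_share = sum(shares.values()) / len(shares)
best_test = max(shares, key=shares.get)

print(f"mean share of variance: {mean_share:.3f}")   # about 0.107
print(f"highest saturation: {best_test}")            # Marks
```

The mean of the squared saturations is a little under eleven per cent, in agreement with the figure given in the text.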

The column D has 20 nearly vanishing entries. The tests with sig- 
nificant factor loadings in this column are Arithmetical Reasoning 
(.49), Mechanical Movements (.46), Number Series (.47), Verbal 
Analogies (.32), and the three syllogism tests (.27, .38, .36). This
factor has been identified before as deductive in character. Number 
Series has saturation in both the inductive and deductive factors. 

Column 9 has only three or four tests with significant satura-
tions, namely, Identical Numbers (.46), Scattered X’s (.44), and pos-
sibly Repeated Letters (.29), and Squares (.25). All other entries in
this column are vanishingly small. All of these tests have in common 
that the subject hunts over the page, or through a column or a row, 
for some identity. This feature is prominent in the first three of the 
four tests listed. This characteristic of column 9 might be used as a 
basis for investigating a larger number of tasks with similar features 
in order to determine whether any clear primary factors can be found 
in them. 

The two remaining columns do not show any large saturations, 
and they are consequently left as residual factors. 

In order to determine the correlations between the primary fac-
tors, we turn to the matrix Λ of Table 4. The columns of this matrix
show the direction cosines λ_mp of the primary reference vectors Λ_p.
The direction cosines t_mp of the primary trait vectors T_p are propor-
tional to the entries in rows of Λ⁻¹. In Table 5 we have the cosines
of the angular separations between the reference vectors Λ_p. The co-
sines are given by the matrix product Λ’Λ.
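
The relation between the reference vectors and the primary trait vectors may be illustrated with a small numerical sketch. It is not taken from the study: the two-dimensional transformation matrix (written LAMBDA in the code) is a hypothetical example, and the helper functions are our own. The columns of LAMBDA are unit reference vectors; the primary vectors are obtained by normalizing the rows of the inverse.

```python
import math

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical transformation matrix; each column is a unit reference vector.
LAMBDA = [[1.0, 0.8],
          [0.0, 0.6]]

# Cosines between the reference vectors: the matrix product LAMBDA' LAMBDA.
reference_cosines = matmul(transpose(LAMBDA), LAMBDA)

# Primary trait vectors: rows of the inverse, normalized to unit length.
T = [[x / math.hypot(*row) for x in row] for row in inverse_2x2(LAMBDA)]

# Cosines between unit primary vectors, which are also their intercorrelations.
primary_cosines = matmul(T, transpose(T))

# Each primary vector is orthogonal to every reference vector but its own.
projections = matmul(T, LAMBDA)

print(round(reference_cosines[0][1], 3))   # 0.8
print(round(primary_cosines[0][1], 3))     # -0.8
```

In this two-dimensional case a positive cosine between the reference vectors corresponds to a negative cosine of equal size between the primary vectors, which shows why the two tables of cosines must not be confused.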

Table 6 shows the cosines of the angular separations of the pri-
mary trait vectors T_p. Since these are unit vectors, the cosines are
also their intercorrelations. Most of the correlations are low, but 
some of them are appreciable. The perceptual factor seems to be quite 
independent of the other primary factors found so far except W. The 
number factor correlates higher than we should expect with the two 
verbal factors and with space and memory. The word factor has ap- 
preciable correlations with most of the other primaries as found in 
the present experiment. The matrix of Table 6 has been investigated 
to determine whether it can be interpreted as being essentially of 
rank one. If such an interpretation can be justified, we should be able 
to account for the correlations of the primaries by a single general 
factor which might be the general factor postulated by Spearman. A
best fitting single factor has been determined by a formula of Spear- 
man (4), and we have listed the residuals in Table 7. These residuals 
are small. The largest residuals are found with the deductive factor, 
but this factor is unstable and not yet so clearly indicated as the 
others. There is some justification for interpreting these results as 
indicative of a second-order general factor which accounts for most 
of the correlations between the primary factors. 
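
The test for rank one described above can be sketched as follows. The text does not reproduce the formula of reference (4), so the estimate used here, a classical single-factor formula associated with Spearman, is an assumption and may differ from the one actually applied. The correlation matrix is hypothetical, built from known loadings so that the fit is exact.

```python
import math

def single_factor_loadings(R):
    """Single-factor loadings from the off-diagonal correlations only.

    a_j^2 = (S_j^2 - Q_j) / (T - 2 S_j), where S_j is the sum of the
    correlations of variable j with the others, Q_j the sum of their
    squares, and T the sum of all off-diagonal entries. Needs n >= 3.
    """
    n = len(R)
    S = [sum(R[j][k] for k in range(n) if k != j) for j in range(n)]
    Q = [sum(R[j][k] ** 2 for k in range(n) if k != j) for j in range(n)]
    T = sum(S)
    return [math.sqrt((S[j] ** 2 - Q[j]) / (T - 2 * S[j])) for j in range(n)]

def residuals(R, a):
    """Off-diagonal residuals r_jk - a_j a_k; the diagonal is set to zero."""
    n = len(R)
    return [[0.0 if j == k else R[j][k] - a[j] * a[k] for k in range(n)]
            for j in range(n)]

# Hypothetical correlations of exactly rank one (not the values of Table 6).
true_loadings = [.7, .6, .5, .4]
R = [[1.0 if j == k else true_loadings[j] * true_loadings[k]
      for k in range(4)] for j in range(4)]

estimated = single_factor_loadings(R)
largest_residual = max(abs(x) for row in residuals(R, estimated) for x in row)

print([round(a, 3) for a in estimated])   # [0.7, 0.6, 0.5, 0.4]
```

With observed correlations the residuals do not vanish, and their size indicates how nearly the matrix can be regarded as of rank one.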

In Table 8 we have listed the saturation of each of the primary 
factors with the second-order general factor. It seems strange that 
induction is not represented. These saturations for the second-order 
general factor are quite different from those which have been recently 
determined for eighth-grade children in another factorial study. The 
correlations between the primary factors cannot yet be determined 
with the stability that is desirable for the investigation of second- 
order general factors. 

Our principal findings in this study can be summarized as follows: 

1) Primary factors have reappeared in several independent stud-
ies. They represent distinct functional unities. The primary abil-
ities about which we now have a good deal of confidence are: (1) the
number factor N, (2) the verbal factor V, (3) the word factor W,
(4) the space factor S, and (5) the memorizing factor M. It should
be distinctly understood that the isolation of primary functional uni- 
ties of mind does not imply that these functions are indivisible. They 
are linearly independent functions, but they are not necessarily sta- 
tistically independent, or uncorrelated. 

2) The factors that reappear in successive studies but whose psy-
chological nature has not yet been identified satisfactorily are: (1)
the perceptual factor P, (2) the inductive factor I, and (3) a deduc-
tive or restrictive thinking factor D. That some functional unities of
this kind exist seems clear, but we do not yet have a sufficient under- 
standing of them to predict their behavior with certainty. To be sure, 
in the present investigation we constructed eight new tests for induc- 
tion, and they all showed some variance on the same factor, but the 
saturations are not so high as we have obtained for other factors. 

In response to many inquiries about these investigations we made 
available an experimental test battery for most of these factors. Our 
first determinations of the intercorrelations between the primary fac-
tors were made by taking the inverse of a reduced matrix Λ, omitting
some of the dimensions of the common factors. This was an error
which we have since corrected by using the complete matrix Λ. The
result of the correction is to increase the correlations between the pri-
mary factors. The composites for the primary factors have still high-
er correlations because there are not available any pure tests for the
primary factors. This limitation is universal in psychological and edu- 
cational tests which usually overlap in common elements that are not 
represented in the criteria. In devising tests for a primary ability, it 
is the primary factor that serves as a criterion. A perfect test of a 
primary factor should have zero saturation on all but one of the pri- 
mary reference traits, its specificity should vanish to assure the ab- 
sence of unknown factors, and its reliability should be high. As long 
as the specificity of a test is appreciable, it involves unknown factors. 
So far we have not succeeded in devising tests with more than half of 
the total variance on a single primary factor, but this corresponds to 
a validity-correlation of about .70, which is quite satisfactory accord- 
ing to customary test standards. The experimental test battery should 
not be used as a service instrument. It is intended for those who want 
to experiment with tests for the primary mental abilities. A new bat- 
tery for eighth-grade children has recently been completed with a 
number of test improvements which will be described in a separate 
publication. 
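
The correspondence stated above between half of the total variance and a validity-correlation of about .70 is simple arithmetic, shown here as a short sketch. The function name is our own, and the relation assumes that the share of variance on a factor is the square of the correlation of the test with that factor.

```python
import math

def validity_from_variance_share(share):
    """Correlation of a test with a factor that accounts for `share` of its variance."""
    return math.sqrt(share)

half = validity_from_variance_share(0.50)
print(round(half, 2))   # 0.71
```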

This study was subsidized by the Carnegie Corporation through 
The Carnegie Foundation for the Advancement of Teaching. The 
Corporation and the Foundation do not assume any editorial respon- 
sibility in the publication of research under these grants. 

The author is indebted to the principal at Hyde Park High School, 
Mr. Joseph F. Gonnelly, for his interest in these investigations, and 
to Miss Mae K. Kernan, the school psychologist. Special acknowledg- 
ments are also due the Superintendent of Schools, Dr. William H. 
Johnson, and the director of the Bureau of Child Study, Dr. Grace
Munson, for their support of these studies in the Chicago schools. Mr.
M. P. Nelson, of the North Park College, made the initial arrange- 
ments for this particular study. 
TABLE 1
Rotated Factorial Matrix

                          TIME
     TESTS                LIMITS        P     N     W     V     S     M     I     D     9    10    11
  1. Addition             1-7       -.034  .542  .086 -.043 -.025 -.014  .074 -.015 -.047  .034 -.187
  2. Areas                3-8        .202  .065  .275 -.050 -.005  .089  .186  .160 -.013 -.067 -.005
  3. Arithmetic           2-20       .027  .384 -.019  .057  .005  .084  .125  .489 -.085 -.066  .155
  4. Cards                6-12       .072 -.038  .011 -.040  .630  .024  .020  .190 -.072  .011 -.065
  5. Completion           2-5        .180 -.013  .069  .580  .088 -.011 -.070  .094 -.059 -.105  .075
  6. Directions           2-7        .068  .059  .156  .420 -.041  .055  .293  .231 -.018  .126  .045
  7. Disarranged Words    2-8        .030  .037  .421  .285 -.053 -.086 -.004 -.028  .080  .027  .056
  8. Figure Grouping      5-6        .108  .099  .190  .004  .097  .006  .310 -.011 -.056  .162 -.103
  9. Figures              5-11      -.005 -.027  .008  .072  .694 -.050  .057  .025  .027  .083 -.075
 10. Identical Forms      8-8        .582 -.036  .209 =.014  .050 -.036  .170 -.036  .007 -.010  .182
 11. Identical Numbers    1-3        .289  .050  .007  .083 -.034  .048  .011 -.100  .464  .182= .159
 12. Initials Recall      8-2-7-7   -.019 -.045  .038  .018 -.049  .588 -.022 -.024 -.008 -.079  .176
 13. Letter Grouping      10-8       .066  .005  .256  .309  .012 -.076  .356 -.080  .097  .064 -.150
 14. Letter Series        4-9        .107  .074  .336  .163 -.061  .071  .386  .164 -.032  .221 -.018
 15. Marks                13-8      -.030  .088  .269  .051  .021  .077  .430 -.044  .001 -.071  .092
 16. Mechanical Movements 9-15       .054 -.014 -.020 -.045  .165  .021 -.053  .465  .024 -.239 -.092
 17. Mirror Reading       2-7       -.020  .009  .423 =.061  .020  .075  .050  .184  .018 -.018  .076
 18. Multiplication       1-7        .089  .617 -.011  .078 -.013  .001 -.048 -.042 -.020 -.036 -.024
 19. Number Patterns      6-8        .073  .291  .297 -.037  .092 -.026  .392  .050 -.080 -.044  .271
 20. Number Series        4-10       .114  .273  .074  .056 -.020 -.053  .256  .467  .040  .094 -.026
 21. Patterns             3-4        .119 -.065  .289 -.019  .100  .035  .262  .157  .022  .256  .108
 22. Proverbs             2-3        .033 -.065 -.068  .652  .051  .126  .185  .017 -.011  .038 -.078
 23. Pursuit              2-7        .283  .162 -.041 -.045  .205  .081  .262  .197  .073 -.029 -.009
 24. Reasoning I          2-5       -.059  .004 -.023  .344 -.049  .078  .214  .272  .072  .003  .099
 25. Reasoning II         3-7       -.102 -.040  .036  .258 -.064  .055  .023  .376 -.011 -.061 -.126
 26. Reasoning III        2-7        .135 -.009  .032  .349  .054 -.057  .089  .359 -.017  .256  .008
 27. Repeated Letters     2-8        .280 -.002  .189  .017  .080  .062  .091 -.070  .294  .171 -.061
 28. Same-Opposite (R)    2-5        .127  .056  .053  .679  .056 -.010 -.002  .006 -.039  .040  .215
 29. Scattered X’s        2-10       .346 -.042  .083 -.013  .082 -.078 -.080  .070  .436 -.034  .066
 30. Spelling             10        -.068  .068  .179  .364 -.087  .063 -.094  .054  .083  .280  .153
 31. Squares              2-6       -.018  .029  .009 -.022  .416 -.010  .266  .182  .248 -.057  .047
 32. Verbal Analogies     2-7        .014 -.074  .215  .408  .026 -.078  .124  .320 -.090 -.054 -.087
 33. Verbal Enumeration   2-6        .542  .046  .019  .200 -.052  .026 -.034 -.042 -.002 -.017  .196
 34. Word Number          $-2-8-5    .022 -.034 -.023 -.046  .006  .579  .035 -.043 -.005  .065 -.092
 35. Word Patterns        5-10       .102 -.095  .172  .455 -.067  .042  .244  .017  .049 -.043  .000
-36. Same-Opposite (W)    2-5       -.216 -.216  .085  .394 -.023 -.044 -.040  .248  .058  .065 -.034
 37. Sex                             .005 -.308  .175  .186 -.144  .186  .269 -.412  .014  .307  .079
TABLE 2


Correlation Matrix 


1 2 3 4 5 6 7 8 9 10 11 12 13


.284 .072 .090 .224 .264 .219 .177 .029 .288 .078 .285 
.220 865.839 .819 .880 .341 .250 .294 .882 .190 .187 .345 
.284 .365 858 .429 .527 840 .224 .832 .149 .096 .228 .277 
072 .389 .358 301 .298 .288 .264 .727 .304 .052 .092 .247 
090.319 .429 .301 588 .443 .175 .814 .827 .180 .249 .351 
224 880 .527 .298 .588 425 .259 .280 .277 .190 .218 .479 
264 .341 .840 .2388 .448 .425 252 .814 .818 .868 .237 .414 
219 .250 .224 .264 .175 .259 .252 .298 .276 .184 .089 .346 
77 «=.294 8382 .727 .814 .280 .314 .298 805 .099 .068 .294 
10 .029 .8382 .149 .3804 .827 .277 .3818 .276 .805 342 .188 .307 
11 .288 .190 .096 .052 .180 .190 .368 .184 .099 .342 130 .273 
12 .078 .187 .228 .092 .249 .218 .237 .089 .068 .138 .130 .130 
18 .285 .845 .277 .247 .851 .479 .414 .346 .294 .807 .273 .180 
14 .275 .448 .459 .828 .406 .572 .441 .871 .800 .328 .247 .205 .492 
15 .172 .810 .294 .284 .215 .884 .291 .287 .272 .240 .101 .196 .835 
16 .092 .284 .442 .428 .810 818 .148 .116 .334 .135 -.065 .121 .165 
17 .244 .475 .407 .896 .425 .469 .546 .272 .846 .305 .192 .233 .384 
18 .684 .247 .868 .091 .237 .2438 .888 .165 .186 .141 .421 .142 .275 
19 .279 .874 .488 .880 .272 .404 .889 .806 .379 .353 .171 .184 .391 
20 .304 .879 .624 .850 .871 .542 .843 .810 .807 .224 .212 .102 .397 
21 .084 .291 .268 .826 .244 .890 .251 .295 .327 .3866 .184 .167 .270 
22 .105 .210 .887 .187 .614 .615 .8138 .180 .201 .175 .111 .222 .897 
23.261 .484 .3897 .409 .800 .868 .210 .3824 .413 .852 .257 .161 .328 
24 .070 .188 .447 .153 .403 .507 .197 .112 .154 .095 .046 .165 .258 
25 .064 .155 .412 .166 .3895 .443 .251 .090 .131 .035 —099 .130 .233 
26 .049 .223 .891 .289 .480 .559 .277 .141 .242 .203 .072 .064 .300 
27 821 .887 .120 .178 .227 .206 .3861 .3827 .3806 .401 .500 .128 .363 
28 .124 .272 .425 .209 .754 .627 .482 .163 .248 .271 .192 .245 .875 
29 .215 .887 .193 .201 .207 .152 .842 .155 .226 .398 .507 .063 .284 
30 .181 .160 .292 .020 .393 .874 .3820 .098 .053 .032 .212 .208 .241 
31 .227 .864 .415 .519 .273 .3843 .227 .294 .577 .3820 .240 .134 .339 
82 .068 .354 .485 .3848 .620 .604 .369 .195 .293 .241 -.051 .170 .3538 
83 .047 .252 .128 .141 .3849 .244 .248 .083 .122 .481 .312 .199 .227 
84 .165 .221 .1387 .122 .126 .189 .1386 .106 .100 .120 .185 .449 .142 
35 .096 .309 .3852 .201 .5381 .529 .411 .216 .216 .267 .196 .225 .435 
-36 -.087 .118 .185 .062 .324 .3863 .123 -.030 .050 -.046 -.091 .066 .151 
37 -.219 -.058 ~—.290 -.202 -.152 -.024 .006 .026 -.195 .117 .056 .134 .032 
TABLE 2 (continued)


Correlation Matrix 
244
396
425 
-469 
546 
212 
346 
805 
192 
23 

384 
481 
853 
411 


224 
48 

410 
398 
326 
362 
291 
294 
323 
.306 
378 
275 
326 
379 
426 
155 
203 
373 
.204 - 


18 


684 
247 
368 
091 
237 
-243 
165 
186 
141 


327 
316 
044 
199 
312 
108 
.049 
076 
352 
296 
287 
293 
232 
.116 
186 
206 
174 
.086 


19 


279 
374 
438 
380 
272 
404 
339 
306 
379 
353 
171 
184 
391 
507 
418 
245 
433 
327 


425 
339 
229 
392 
256 
162 
256 
275 
294 
237 
148 
528 
354 
247 
142 
256 
059 


20 


304 
379 
624 
350 
2371 
542 
343 
310 
307 
224 
212 
102 
397 
512 
305 
364 
410 
316 
425 


322 
325 
451 
375 
369 
-450 
219 
345 
229 
217 
440 
485 
117 
119 
375 
205 


21 


084 
291 
268 
326 
.244 
-390 
251 
295 
327 
366 
184 
.167 
.270 
439 
316 
196 
398 
044 
339 
322 


156 
317 
212 
146 
-286 
305 
204 
180 
159 
365 
370 
118 
109 
306 
082 


872 -—.161 -—.193 -.037 -.249 -.018 


22 


-105 
210 
387 
187 
614 
615 
313 
.180 
.201 
175 
sad 
222 
397 
432 
230 
198 
326 
199 
229 
325 
156 


265 
470 
377 
446 
176 
676 
044 
365 
203 
530 
235 
205 
545 
357 — 
-076 -.164 -.014 -.147 


.280 
013 


24 


.070 
.188 
447 
153 
403 
507 
197 
112 
154 
.095 
.046 
165 
258 
386 
.249 
278 
291 
108 
256 
375 
212 
470 
214 


368 
355 
083 
451 
030 
.261 
.207 
493 
-100 
059 
391 
332 


25 


.064 
155 
412 
.166 
395 
448 
251 
090 
131 
035 
-.099 
130 
.233 
314 
150 
332 
294 
049 
162 
369 
146 
377 
.126 
368 


391 
023 
326 
.018 
198 
197 
491 
026 
097 
291 
386 
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TABLE 2 (continued) 


Correlation Matrix 








26 


.049 
.223 
391 
.289 
430 
559 
277 
141 
242 
203 
072 
.064 
.300 
447 
188 
295 
323 
.076 
256 
450 
.286 
446 
237 
355 
391 


-102 
A497 
118 
346 
258 
5382 
243 
078 
386 


356 — 


-.006 


27 


321 
337 
120 
178 
227 
206 
361 
327 
306 
401 
500 
128 
363 
356 
-208 
097 
306 
352 
275 
.219 
305 
176 
377 
083 
023 
102 


.180 
513 
.229 
322 
130 
293 
192 
191 
085 
.093 


28 


124 
272 
425 
.209 
-754 
627 
432 
168 
.248 
271 
192 
-245 
375 
407 
.203 
198 
378 
.296 
294 
345 
.204 
676 
244 
451 
326 
497 
180 


197 
481 
.210 
544 
392 
182 
533 


374 - 
003 - 


29 


215 
337 
193 
201 
.207 
152 
342 
155 
.226 
-398 
507 
.063 
.284 
247 
131 
122 
275 
.287 
237 
.229 
180 
.044 
307 
.030 
.018 
118 
513 
197 


103 
335 
123 
317 
116 
196 
048 
082 


30 


181 
.160 
292 
.020 
393 
374 
320 
08 
053 
-032 
212 
.208 
241 
289 
.098 
005 
326 
298 
148 
217 
159 
865 
086 
261 
198 
346 
229 
481 
103 


039 
330 
199 
111 
307 
-290 


31 = 82 


227 = .068 
364 .354 
415.485 
519 .348 
273 =.620 
343.604 
227 = .869 
294 .195 
577 = .298 
320 .241 
.240 -.051 
134 .170 
839 353 
381 .497 
353  .277 
392 .405 
379 .426 
.282 .116 
528 .354 
440.485 
365 .370 
203 = .530 
506 .298 
207 .493 
197.491 
253 = .532 
322 .130 
210 .544 
835 = .123 
039 .330 


366 


366 

041 .196 
146 .103 
.238 © .525 
.089 .492 -.001 -.023 
.088 


117 -.187 -.066 


33 


047 
252 
128 
141 
349 
244 
248 
.088 
122 
481 
312 
199 
227 
219 
129 
052 
155 
186 
247 
ALT 
118 
235 
.238 
100 
026 
248 
293 
392 
317 
199 
041 


196 


.080 
277 


124 


34 
165 
221 
137 
122 
126 
.189 
136 
106 
100 
120 
185 
449 
142 
.236 
168 
017 
.203 
.206 
142 
119 
109 
205 
179 
.059 
.097 
073 
192 
132 
116 
at 
146 
103 
080 


178 


165 


385 —36 37 


.096 -.087 -.219 
809 .118 -.058 
852 .185 -.290 
.201 .062 -.202 
581 .824 -.152 
529 .363 -.024 
411 .123 .006 
.216 -.030 .026 
216 .050 -.195 
.267 -.046 .117 
.196 -.091 .056 
.225 .066 .134 
485 .151 .082 
A7T7 .219 .080 
341 .063 .029 
226 .207 -.372 
3873 .204 -.161 
.174 -.086 -.193 
.256 .059 -.037 
3875 .205 -.249 
806 .082 -.013 
545 .357 —.076 
.280 -.013 -.164 
391 .332 -.014 
.291 .386 —.147 
386 .356 -.006 
191 -.085 .093 
533.874 = .003 
.196 —.043 -.082 
307 .290 .117 
.238 = .089 -.187 
525 .492 —.066 
277 -.001 .124 
173 -.023 .088 


.233 .096 


.233 032 
096 .032 
TABLE 3 


Centroid Factor Matrix 








Z {ie fe 5g 


3869 .283 -.146 .432 
570 .185 .095 -.036 
652 -.193 .196 .244 
520 .171 .449 —.258 
684 -.341 -—.221 -.218 
-754 -.295 -.011 .096 
606 .089 -.193 -.092 
408 .217 .101 .101 
5381 .241 .340 -.234 
479 .326 -.071 -—.344 
363 .392 -.415 -.072 
333 .052 —.180 .160 
603 .083 -.073 .064 
720 .015 .073 .132 
475 .184 .155 .142 
428 -.188 .378 -.127 
667 .056 .108 -—.069 
446 .249 -.336 .354 
601 .220 .208 .127 
661 -.099 .227 .179 
492 124 .213 -.092 
604 —.404 -.227 .065 
580 .236 .174 .038 
492 -.388 .054 .117 
426 -—.427 .180 .058 
555 —.358 .061 —.092 
484 .448 -.205 -.085 
671 -—.378 —.343 -.124 
418 .371 —.178 —.286 
421 -.217 -.301 .031 
587 .229 .3872 -.039 
676 —.424 .158 -—.121 
381 .100 -—.344 —.259 
291 .179 -.160 .241 
617 -.202 -—.142 -.028 
.285 -.519 .061 —.096 


-.091 .062 -.283 .016 


Y Wi Vil Vili AX XK wa 


3842 -109 -.141 -.170 -.122 -.105 -.094 
-.085 -.043 -.096 .183 -.059 -.117 -.069 
224 .125 -028 .142 -.157 -.083 .116 
100 .265 -—114 -.133 .0387 .164 -.125 
-.035 .180 -.075 -.084 -.064 -.053 -.064 
-.106 -.053 .101 -.015 -.014 .045 -.009 
-.049 —.209 —.216 -—.088 -—.065 -.082 .099 
-.079 -—.095 .096 -.099 -—.022 .069 —.135 
155.239 -—.070 -.307 .107 .152 -.065 
—.212 .085 .227 .136 -.183 -.060 -.108 
198 -.150 .202 .105 .082 .028 .179 
—.236 .266 -.268 .270 .146 .106 .138 
-.100 -.187 .119 -.194 .110 -.100 -.104 
-.177 -.210 .076 .012 -.043 .074 ~.053 
-.257 .022 .072 -.072 .056 -.118 .097 
153.106 -.150 .175 .050 -.179 -.087 
-.096 —.132 —278 .044 -.048 -.059 .114 
412 .058 -.123 -.144 —202 -.126 -.019 
-.117 .099 .083 -.092 -—.201 -.101 .184 
-196 —132 .121 .099 -.102 -—.057 -—.034 
-.172 -.147 .087 .051 -.054 .181 .053 
-.107 .132 .096 -.133 .165 .063 -.126 
099 .151 .176 .087 .017 -.051 -—.107 
-.037 .088 .112 .050 .083 -.018 .109 
.032 -.068 -.126 .087 .085 -.072 -.064 
047 -.101 .123 .027 -.078 .189 -.080 
094 -.175 .082 .046 .105 .038 -.047 
-.049 .172 .028 -.160 -.107 .067 .073 
.219 -—.147 .086 .164 .066 -.169 .078 
.024 -.161 -.133 -.054 -.096 .223 .175 
156 .120 .102 -—.089 .181 -.056 .131 
-.110 -.048 -—.071 -.038 -.026 -.081 -.075 
-.077 .161 .167 .152 -.196 -.025 -.095 
-.144 .142 -.209 .225 .227 .191 -—.090 
-.207 -.017 .080 -.045 .095 -.101 -.042 
-.037 —.152 -.103 -.040 .116 .052 .072 
-A9T —129 .189 -.056 .143 .276 .041 


h2 


637 
427 
692 
-723 
-738 
692 
542 
292 
729 
.630 
616 
494 
510 
.633 
382 
491 
576 
-765 
592 
625 
All 
687 
509 
446 
430 
525 
547 
806 
569 
453 
647 
-710 
471 
432 
515 
422 
493 
TABLE 4 


Direction Cosines 4,,, of Primary Reference Vectors A, 








P xs - S M I D 9 10 911 


I 177°» «109 «=.222) «818 «=.124 096 .284 .225 .079 .066 .052 
II .234 .1380 .190 -.565 .203 .110 .124 -.369 .130 .039 .018 
III -.246 -.087 .080 —481 .343 -.1388 .277 .462 —.146 -.043 -.082 
IV -.341 .567 -.071 -.176 -.333 .3854 .448 .000 -.242 .067 -.121 
V 017 .3867 -.552 -.067 .294 -.243 -.476 .427 .351 .048 —.158 
VI 145 .164 -.5386 .048 .441 .354 -101 -.144 -.240 —486 .278 
VII 480 .024 -.333 .164 -.048 -303 .596 .004 .269 .219 .185 
VII 448 -.086 -.198 -—.480 -.452 .5380 -.201 .609 .178 -.040 .160 
IX -.379 -.681 -.334 .281 .275 .877 .083 -.149 .639 —107 -.412 
X -.055 —.123 -.157 -002 .887 .3867 -.114 .048 -—112 .857 .068 
XI -.428 .000 .159 -.028 -.045 -.004 .004 -.030 .488 .000 .800 





TABLE 5 


Cosines of Angular Separations Between Reference Vectors A, 








          P      N      W      V      S      M      I      D      9     10     11

P     1.000
N      .132  1.000
W     -.164  -.029  1.000
V     -.112  -.250  -.090  1.000
S     -.162  -.179  -.336   .091  1.000
M     -.051  -.155  -.238  -.214  -.025  1.000
I     -.034   .083   .215   .033  -.144  -.061  1.000
D      .168   .186  -.225  -.302  -.125   .044  -.191   .999
9     -.094  -.447  -.268   .125   .083  -.003  -.065   .083  1.001
10     .016  -.026   .057   .015   .111   .061   .092   .093   .009  1.000
11     .073   .192   .115  -.072  -.107   .007   .023   .000   .083   .007  1.001





TABLE 6 


Correlations of Primary Factors 





          P      N      W      V      S      M      I      D

P     1.000
N      .108  1.000
W      .312   .389  1.000
V      .144   .294   .314  1.000
S      .258   .263   .475   .160  1.000
M      .200   .364   .443   .357   .261  1.000
I      .006  -.105  -.108  -.018   .108   .003  1.000
D     -.029  -.026   .257   .327   .234   .120   .184  1.000
TABLE 7 
First Factor Residuals 


          P      N      W      V      S      M      I      D

P      .911
N     -.007   .851
W      .093   .105   .458
V     -.012   .093  -.072   .725
S      .078   .030   .031  -.156   .636
M      .025   .136   .009   .048  -.095   .652
I      .006  -.105  -.107  -.013   .108   .003  1.000
D     -.124  -.148   .022   .160   .042  -.068   .184   .898





TABLE 8 


Saturations of Primaries in Second-Order General Factor 








   W      V      S      M      I      D


 .736   .524   .603   .590   .000   .319
A COMPUTATIONAL PROCEDURE FOR THE METHOD 
OF PRINCIPAL COMPONENTS 


MERRILL M. FLOOD 
PRINCETON UNIVERSITY 


An n-rowed correlation matrix may be thought of as an ellip- 
soid in n-dimensional space with its center at the origin. The prin- 
cipal components of the matrix are essentially the semi-axes of the 
ellipsoid. A direct and simple method of computing the lengths and 
directions of these semi-axes is presented. 


This paper makes no attempt to present a comparison of the 
many computational methods which have been proposed for use in the 
application of the method of principal components. Hotelling intro- 
duced (1) this method of statistical analysis and has offered a simpli- 
fied computational method (2). Other writers, including Frisch (3), 
Burt (4), and Thurstone (5), have considered the general problem of 
“factor analysis,” have discussed the method of principal components 
in its mathematical, computational, and statistical aspects, and have 
compared this method with others suggested for use in applied prob- 
lems of “factor analysis.” Only the question of computational proce- 
dure for the method of principal components is considered here. 

Stated briefly, the problem with which we are concerned is the 
practical determination of the latent roots and invariant vectors* of 
a matrix M. In the particular case of statistical importance the matrix 
M is the real, symmetric, positive definite correlation matrix R of 
order $v$. The latent roots $x_i$ and invariant vectors $d_i$ are the 
solutions of the matrix equation 


\[ (R - x)\,d = 0, \tag{1} \]


where x is a scalar matrix and d is a matrix (vector) with only one 
column. 

The characteristic function of R is the scalar polynomial f(x) defined 
by 


\[ f(x) = |x - R| = \prod_{i=1}^{v} (x - x_i) = x^{v} - \sum_{h=0}^{v-1} a_h x^{h}. \tag{2} \]


* For the matric terminology, notation, and theorems used in this paper, see 
Wedderburn’s Lectures on Matrices (6), particularly Chaps. 1, 2, 3, and 6. An 
elegant presentation of the method of principal components, using the algebra of 
matrices, has been given by Householder and Young (7). 
It is well known that R satisfies its characteristic equation, that is 


\[ f(R) = R^{v} - \sum_{h=0}^{v-1} a_h R^{h} = 0. \tag{3} \]


If the roots $x_i$ of R are distinct, then f(x) is the only polynomial of 
degree $v$, and leading coefficient unity, having R as a matric root. 
This characteristic function may be computed, therefore, by solving 
the $v^2$ linear equations of (3) above, for the values of the $v$ coefficients 
$a_h$. The roots $x_i$ may then be found by applying Horner’s method, for 
example, to the polynomial f(x). Finally, the components of each 
invariant vector $d_i$ may be computed by solving the set of $v$ linear 
homogeneous equations 


\[ (R - x_i)\,d_i = 0. \tag{4} \]


This completes the mathematical exposition of the computational 
method proposed in this paper for the practical determination of the 
latent roots and invariant vectors of a correlation matrix. 
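
The procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's own computing routine: the function name `largest_root_and_vector` is hypothetical, `numpy.roots` stands in for Horner's method, and a singular value decomposition stands in for the hand solution of the homogeneous equations (4). The input is the 4 x 4 correlation matrix of the worked example in this paper.

```python
import numpy as np


def largest_root_and_vector(r_star):
    """Sketch of the procedure: coefficients a_h, greatest root x1, vector d1."""
    r_star = np.asarray(r_star, dtype=float)
    v = r_star.shape[0]
    r = r_star - np.eye(v)  # R = R* - 1: same vectors, roots reduced by unity

    # First columns of successive powers: e1, R e1, ..., R^v e1.
    cols = [np.eye(v)[:, 0]]
    for _ in range(v):
        cols.append(r @ cols[-1])

    # Solve (e1, R e1, ..., R^(v-1) e1) a = R^v e1 for a_0, ..., a_(v-1).
    krylov = np.column_stack(cols[:v])
    a = np.linalg.solve(krylov, cols[v])

    # f(x) = x^v - a_(v-1) x^(v-1) - ... - a_0 (numpy.roots replaces Horner).
    coeffs = np.concatenate(([1.0], -a[::-1]))
    x1 = np.roots(coeffs).real.max()

    # Null vector of (R - x1), taken from the SVD (a substitute for solving
    # equations (4) by hand), scaled so its largest component is unity.
    _, _, vt = np.linalg.svd(r - x1 * np.eye(v))
    d1 = vt[-1] / vt[-1][np.argmax(np.abs(vt[-1]))]
    return a, x1, d1


hotelling = [
    [1.0, 0.9596, 0.7686, 0.5427],
    [0.9596, 1.0, 0.8647, 0.7005],
    [0.7686, 0.8647, 1.0, 0.8230],
    [0.5427, 0.7005, 0.8230, 1.0],
]
a, x1, d1 = largest_root_and_vector(hotelling)
print("coefficients a_0..a_3:", a)
print("greatest root of R:", x1, " of R*:", 1.0 + x1)
print("invariant vector:", d1)
```

The greatest root of R* is recovered by adding unity back to `x1`. For matrices of larger order the solve step may become ill-conditioned, so the sketch should be read as a check on the small example rather than as a general-purpose routine.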

An example will illustrate the method of computation. For this 
purpose, I shall analyze the matrix R* below, which was used by 
Hotelling (2) to illustrate his simplified method for the calculation of 
latent roots and invariant vectors (principal components). 


\[ R^* = \begin{pmatrix} 1. & .9596 & .7686 & .5427 \\ .9596 & 1. & .8647 & .7005 \\ .7686 & .8647 & 1. & .8230 \\ .5427 & .7005 & .8230 & 1. \end{pmatrix} \tag{5} \]


The calculations may be simplified at the outset by considering the 
matrix R = R* — 1 which has the same invariant vectors as R* but 
latent roots each reduced by unity. 
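
The reduction can be verified in one line. Writing $x^*$ for a latent root of R* and $d$ for the corresponding invariant vector (the symbol $x^*$ is introduced here only for this check), we have

\[ R^* d = x^* d \quad\Longrightarrow\quad R\,d = (R^* - 1)\,d = x^* d - d = (x^* - 1)\,d, \]

so $d$ is also an invariant vector of R, with latent root $x = x^* - 1$. The practical gain is that R has zeros on its principal diagonal.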

Of the $v^2$ linear equations (3) it is only necessary to use $v$, say 
those shown as the first column of f(R). These equations may be 
found without actually raising R to successive powers since only the 
first column of each power is required; thus $R^2 e_1 = R(Re_1)$, 
$R^3 e_1 = R(R^2 e_1)$, and so on. The result is the set of linear equations 


\[ \begin{pmatrix} 1 & 0 & 1.80610141 & 2.69170038 \\ 0 & .9596 & 1.04476977 & 3.75082780 \\ 0 & .7686 & 1.27640822 & 3.36539747 \\ 0 & .5427 & 1.30475760 & 2.76251642 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} 7.68515651 \\ 7.42815763 \\ 7.58573273 \\ 6.85796278 \end{pmatrix} \tag{6} \]


in which the columns of the coefficient matrix are $e_1$, $Re_1$, $R^2 e_1$, 
and $R^3 e_1$, and the right-hand member is $R^4 e_1$. 
Actually, it is unnecessary to include $R^3 e_1$, for $a_3$ must be zero since 
the trace of R is zero. This leaves three equations to be solved for 
the coefficients $a_0$, $a_1$, and $a_2$; the complete solution being 


\[ \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} +0.96314191 \\ +3.68871936 \\ +3.72183675 \\ 0 \end{pmatrix} \]
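
The remark about the trace can be made explicit. In the expansion of the determinant $|x - R|$ the coefficient of $x^{v-1}$ is the negative of the trace of R, so that, comparing with the form of the characteristic function in (2),

\[ a_{v-1} = \operatorname{tr} R = \sum_{i=1}^{v} r_{ii}. \]

Here $v = 4$ and every $r_{ii} = 0$, whence $a_3 = 0$. As a further check on the arithmetic, for a symmetric matrix with zero diagonal the next coefficient is the sum of the squared off-diagonal elements, taken once each:

\[ a_2 = \sum_{i<j} r_{ij}^2 = .9596^2 + .7686^2 + .5427^2 + .8647^2 + .7005^2 + .8230^2 = 3.72183675 . \]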





The greatest root of the polynomial 

\[ f(x) = x^4 - a_2 x^2 - a_1 x - a_0 \]

is found to be $x_1 = 2.33972$. The invariant vector $d_1$ corresponding to 
the root $x_1$ is determined by the equations 


\[ (R - x_1)\,d_1 = \begin{pmatrix} -2.33972 & .9596 & .7686 & .5427 \\ .9596 & -2.33972 & .8647 & .7005 \\ .7686 & .8647 & -2.33972 & .8230 \\ .5427 & .7005 & .8230 & -2.33972 \end{pmatrix} \begin{pmatrix} d_{11} \\ d_{21} \\ d_{31} \\ d_{41} \end{pmatrix} = 0. \]

A solution of these four linear homogeneous equations is found to be 

\[ \begin{pmatrix} d_{11} \\ d_{21} \\ d_{31} \\ d_{41} \end{pmatrix} = \begin{pmatrix} .930447 \\ 1.000000 \\ .977399 \\ .858967 \end{pmatrix} \]


These components of $d_1$ agree, to four decimal places, with those 
found by Hotelling, and the value $1 + x_1$ is the same as that found by 
Hotelling for R*. The other latent roots and invariant vectors may be 
found in similar fashion. 
A number of questions arise concerning this method of computation; 
for example, under what conditions: 
1. Is it faster than other methods? 
2. Is the matrix $(e_1, Re_1, \cdots, R^{v-1}e_1)$ non-singular? 
3. Are the results accurate to the required number of 
decimal places? 
There is the further question concerning the case in which some of 
the roots are equal, or nearly equal. I hope to examine these questions 
in another paper and extend the method for use in analyzing any 
normal* matrix M. 
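
The second question and the case of equal roots are connected, as a small numerical experiment suggests. The sketch below is an illustration only and not part of the method as stated: the helper `krylov_matrix` and the equal-correlation test matrix (all off-diagonal elements .5) are assumptions chosen for the purpose. It compares the rank of $(e_1, Re_1, R^2e_1, R^3e_1)$ for that matrix, whose reduced form has the root $-.5$ three times over, with the rank for the reduced matrix of the worked example.

```python
import numpy as np


def krylov_matrix(r):
    """Columns e1, R e1, ..., R^(v-1) e1, built by repeated multiplication."""
    r = np.asarray(r, dtype=float)
    v = r.shape[0]
    cols = [np.eye(v)[:, 0]]
    for _ in range(v - 1):
        cols.append(r @ cols[-1])
    return np.column_stack(cols)


# Hypothetical test case: every correlation equals .5, diagonal reduced to zero.
# Its latent roots are 1.5 (once) and -.5 (three times).
equal_r = 0.5 * (np.ones((4, 4)) - np.eye(4))
rank_equal = np.linalg.matrix_rank(krylov_matrix(equal_r))

# Reduced matrix R = R* - 1 of the worked example.
hotelling_r = np.array([
    [0.0, 0.9596, 0.7686, 0.5427],
    [0.9596, 0.0, 0.8647, 0.7005],
    [0.7686, 0.8647, 0.0, 0.8230],
    [0.5427, 0.7005, 0.8230, 0.0],
])
rank_hotelling = np.linalg.matrix_rank(krylov_matrix(hotelling_r))

print("rank with equal roots:", rank_equal)
print("rank for the worked example:", rank_hotelling)
```

With repeated roots the matrix of the equations is singular, so the coefficients $a_h$ are not determined by the first-column equations alone; in the worked example the roots are distinct and the matrix has full rank.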


* A matrix M is said to be “normal” if it commutes with its transpose. 